Aos Notes
UNIT I: Overview of UNIX system calls. The anatomy of a system call and x86
mechanisms for system call implementation. How the MMU/memory translation,
segmentation, and hardware traps interact to create kernel–user context separation.
What makes virtualization work? The kernel execution and programming context.
Live debugging and tracing. Hardware and software support for debugging.
UNIT III: Process and thread kernel data structures, process table traversal, lookup,
allocation and management of new structures, /proc internals, optimizations. Virtual
File System and the layering of a file system call from API to driver. Object-
orientation patterns in kernel code; a review of OO implementation generics (C++
vtables, etc).
UNIT IV: OpenSolaris and Linux virtual memory and address space structures. Tying
top-down and bottom-up object and memory page lookups with the actual x86 page
translation and segmentation. How file operations, I/O buffering, and swapping all
converged to using the same mechanism. Kmem and Vmem allocators. OO
approach to memory allocation. Challenges of multiple CPUs and memory hierarchy.
Security: integrity, isolation, mediation, auditing. From MULTICS and MLS to modern
UNIX. SELinux type enforcement: design, implementation, and pragmatics. Kernel
hook systems and policies they enable. Trap systems and policies they enable.
Tagged architectures and multi-level UNIX.
UNIX system calls are used to manage the file system, control processes, and provide interprocess
communication. The UNIX system interface consists of about 80 system calls (as UNIX evolves this
number will increase). The following table lists about 40 of the more important system calls:
[NOTE: The system call interface is that aspect of UNIX that has changed the most since the
inception of the UNIX system. Therefore, when you write a software tool, you should protect that tool
by putting system calls in other subroutines within your program and then calling only those
subroutines. Should the next version of the UNIX system change the syntax and semantics of the
system calls you've used, you need only change your interface routines.]
The anatomy of a System call and x86 mechanisms for system call
implementation.
NaCl syscalls are the interface between untrusted code and the trusted codebase. They are the
means by which a NaCl process can execute code outside the inner sandbox. This is kind of a big
deal, because the entire point of NaCl is to prevent untrusted code from getting out of the inner
sandbox. Accordingly, the design and implementation of the syscall interface is a crucial part of the
NaCl system.
The purpose of a syscall is to transfer control from an untrusted execution context to a trusted one, so
that the thread can execute trusted code. The details of this implementation vary from platform to
platform, but the general flow is the same. This figure shows the flow of control:
The syscall starts as a call from untrusted code to a trampoline, which is a tiny bit of code (less than
one NaCl bundle) that resides at the bottom of the untrusted address space. Each syscall has its own
trampoline, but all trampolines are identical; in fact, they're all generated by the loader from a simple
template. The trampoline does at most two things:
1. Exits the hardware sandbox (on non-SFI implementations) by restoring the original system value of
%ds.
2. Calls the untrusted-to-trusted context switch function (NaClSyscallSeg)
There are many ways of implementing user-to-kernel transitions on x86, i.e. system calls. Let's first
quickly review what system calls actually need to accomplish.
In modern operating systems there is a distinction between user mode (executing normal application
code) and kernel mode (being able to touch system configuration and devices). System calls are the
way for applications to request services from the operating system kernel and bridge the gap. To
facilitate that, the CPU needs to provide a mechanism for applications to securely transition from user
to kernel mode.
Secure in this context means that the application cannot just jump to arbitrary kernel code, because
that would effectively allow the application to do what it wants on the system. The kernel must be able
to configure defined entry points and the system call mechanism of the processor must enforce these.
After the system call is handled, the operating system also needs to know where to return to in the
application, so the system call mechanism also has to provide this information.
I came up with four mechanisms that match this description that work for 64-bit environments. I’m
going to save the weirder ones that only work on 32-bit for another post. So we have:
Software interrupts are the oldest mechanism. The key idea is to use the same method to enter the
kernel as hardware interrupts do. In essence, it is still the mechanism that was introduced
with Protected Mode in 1982 on the 286, but even the earlier CPUs already had cruder versions of
this.
Because interrupt vector 0x80 can still be used to invoke system calls on 64-bit Linux, we are going
to stick with this example:
The processor finds the kernel entry address by taking the interrupt vector number from
the int instruction and looking up the corresponding descriptor in the Interrupt Descriptor Table (IDT).
This descriptor will be an Interrupt or Trap Gate to kernel mode and it contains the pointer to the
handling function in the kernel.
Note: Architecture is a behavioural specification. The caches must behave as if they are physically
tagged. An implementation might do something different, as long as this is not software-visible.
Table entry
The translation tables work by dividing the virtual address space into equal-sized blocks and by
providing one entry in the table per block.
Entry 0 in the table provides the mapping for block 0, entry 1 provides the mapping for block 1, and so
on. Each entry contains the address of a corresponding block of physical memory and the attributes
to use when accessing the physical address.
Table lookup
A table lookup occurs when a translation takes place. When a translation happens, the virtual address
that is issued by the software is split in two, as shown in this diagram:
This diagram shows a single-level lookup.
The upper-order bits, which are labelled 'Which entry' in the diagram, tell you which block entry to look
in and they are used as an index into the table. This entry block contains the physical address for the
virtual address.
The lower-order bits, which are labelled 'Offset in block' in the diagram, are an offset within that block
and are not changed by the translation.
Multilevel translation
In a single-level lookup, the virtual address space is split into equal-sized blocks. In practice, a
hierarchy of tables is used.
The first table (Level 1 table) divides the virtual address space into large blocks. Each entry in this
table can point to an equal-sized block of physical memory or it can point to another table which
subdivides the block into smaller blocks. We call this type of table a 'multilevel table'. Here we can see
an example of a multilevel table that has three levels:
In Armv8-A, the maximum number of levels is four, and the levels are numbered 0 to 3. This multilevel
approach allows both larger blocks and smaller blocks to be described. The characteristics of large
and small blocks are as follows:
Large blocks require fewer levels of reads to translate than small blocks. Plus, large blocks are more
efficient to cache in the TLBs.
Small blocks give software fine-grain control over memory allocation. However, small blocks are less
efficient to cache in the TLBs. Caching is less efficient because small blocks require multiple reads
through the levels to translate.
To manage this trade-off, an OS must balance the efficiency of using large mappings against the
flexibility of using smaller mappings for optimum performance.
Note: The processor does not know the size of the translation when it starts the table lookup. The
processor works out the size of the block that is being translated by performing the table walk.
Segmentation
In Operating Systems, Segmentation is a memory management technique in which the memory is
divided into the variable size parts. Each part is known as a segment which can be allocated to a
process.
The details about each segment are stored in a table called a segment table. Segment table is stored
in one (or many) of the segments.
Till now, we were using paging as our main memory management technique. Paging is closer to the
operating system than to the user. It divides all the processes into pages regardless of the fact that a
process may have related parts or functions which need to be loaded in the same page.
The operating system doesn't care about the user's view of the process. It may divide the same function
into different pages, and those pages may or may not be loaded into memory at the same time. This
decreases the efficiency of the system.
It is better to have segmentation, which divides the process into segments. Each segment contains
the same type of functions; for example, the main function can be included in one segment and the
library functions in another.
The logical address generated by the CPU is divided into two parts:
1. Segment Number
2. Offset
For Example:
Suppose a 16-bit address is used, with 4 bits for the segment number and 12 bits for the segment offset.
The maximum segment size is then 4096 (2^12) and the maximum number of segments that can be
referenced is 16 (2^4).
When a program is loaded into memory, the segmentation system tries to locate space that is large
enough to hold the first segment of the process; space information is obtained from the free list
maintained by the memory manager. Then it tries to locate space for the other segments. Once adequate
space is located for all the segments, it loads them into their respective areas.
The operating system also generates a segment map table for each program.
With the help of segment map tables and hardware assistance, the operating system can easily
translate a logical address into physical address on execution of a program.
The segment number is used as an index into the segment table. The limit of the respective segment is
compared with the offset. If the offset is less than the limit, the address is valid; otherwise an error is
raised because the address is invalid.
In the case of valid addresses, the base address of the segment is added to the offset to get the physical
address of the actual word in the main memory.
The above figure shows how address translation is done in case of segmentation.
Advantages of Segmentation
1. No internal fragmentation
2. Less overhead
3. The segment table is of lesser size as compared to the page table in paging.
Disadvantages of Segmentation
1. It can suffer from external fragmentation, since segments are of variable size and free memory
becomes scattered into holes as segments are allocated and freed.
User Mode
The system is in user mode when the operating system is running a user application, such as a text
editor. The transition from user mode to kernel mode occurs when the application requests the help of
the operating system, or when an interrupt or a system call occurs.
The mode bit is set to 1 in the user mode. It is changed from 1 to 0 when switching from user mode to
kernel mode.
Kernel Mode
The system starts in kernel mode when it boots and after the operating system is loaded, it executes
applications in user mode. There are some privileged instructions that can only be executed in kernel
mode.
These are interrupt instructions, input output management etc. If the privileged instructions are executed
in user mode, it is illegal and a trap is generated.
The mode bit is set to 0 in the kernel mode. It is changed from 0 to 1 when switching from kernel mode
to user mode.
An image that illustrates the transition from user mode to kernel mode and back again is −
In the above image, the user process executes in user mode until it makes a system call. Then a
system trap is generated and the mode bit is set to 0. The system call gets executed in kernel mode.
After the execution is completed, another system trap is generated and the mode bit is set back to 1.
Control returns to user mode and the process execution continues.
Necessity of Dual Mode (User Mode and Kernel Mode) in Operating System
The lack of a dual mode, i.e. user mode and kernel mode, in an operating system can cause serious
problems. Some of these are:
A running user program can accidentally wipe out the operating system by overwriting it with
user data.
Multiple processes can write to the same system resource at the same time, with disastrous results.
Operating system-based Virtualization refers to an operating system feature in which the kernel
enables the existence of various isolated user-space instances. The installation of virtualization
software also refers to Operating system-based virtualization. It is installed over a pre-existing
operating system and that operating system is called the host operating system.
In this virtualization, a user installs the virtualization software in the operating system of his system
like any other program and uses this application to create and operate various virtual machines. The
virtualization software gives the user direct access to any of the created virtual machines. Since the
host OS must provide the mandatory support for hardware devices, operating system virtualization
may run into hardware compatibility issues when a hardware driver is not available to the
virtualization software.
Virtualization software is able to convert hardware IT resources that require unique software for
operation into virtualized IT resources. As the host OS is a complete operating system in itself,
many OS-based services are available as organizational management and administration tools can
be utilized for the virtualization host management.
The operating system may have the capability to allow or deny access to such resources based on
which program requests them and on the user account in whose context it runs. The OS may also
hide these resources, so that when a computer program enumerates them, they do not appear in the
enumeration results. Nevertheless, from a programming perspective, the computer program has
interacted with those resources and the operating system has managed an act of interaction.
With operating-system-virtualization or containerization, it is probable to run programs within
containers, to which only parts of these resources are allocated. A program that is expected to
perceive the whole computer, once run inside a container, can only see the allocated resources and
believes them to be all that is available. Several containers can be formed on each operating
system, to each of which a subset of the computer’s resources is allocated. Each container may
include many computer programs. These programs may run parallel or distinctly, even interrelate
with each other.
Operating system-based virtualization can raise demands and problems related to performance
overhead, such as:
1. The host operating system employs CPU, memory, and other hardware IT resources.
2. Hardware-related calls from guest operating systems need to navigate numerous layers to and
from the hardware, which degrades overall performance.
3. Licenses are frequently essential for host operating systems, in addition to individual licenses
for each of their guest operating systems.
Kernel execution
The kernel execution configuration defines the dimensions of a grid and its blocks. Unique
coordinates in blockIdx and threadIdx variables allow threads of a grid to identify themselves and their
domains of data. It is the programmer’s responsibility to use these variables in kernel functions so that
the threads can properly identify the portion of the data to process. This model of programming
compels the programmer to organize threads and their data into hierarchical and multidimensional
organizations.
In the dictionary a kernel is a softer, usually edible part of a nut, seed, or fruit stone contained within its
shell such as “the kernel of a walnut”. It can also be the central or most important part of something
“this is the kernel of the argument”.
In computing the kernel is a computer program that is the core of a computer’s operating system,
with complete control over everything in the system.
The kernel is often one of the first programs loaded up on start-up before the boot loader.
“A boot loader is a type of program that loads and starts the boot time tasks and processes of an
operating system or the computer system. It enables loading the operating system within the computer
memory when a computer is started or booted up. A boot loader is also known as a boot manager or
bootstrap loader.”
You have probably heard the expression of 'booting up' a system. Once loaded, it is the kernel, not the
boot loader, that translates I/O requests from software into data-processing instructions for the central
processing unit, and that manages memory and peripherals like keyboards, monitors and speakers.
The boot system for all standard computers and operating systems — image by Neosmart retrieved
the 27th of September.
I had an inkling that the kernel was important as part of the computer system operation, however I was
unsure of how it operates. As such I found more information about the Linux Kernel in particular.
Demystifying the Linux Kernel from Digilent blog retrieved the 27th of September.
“…the kernel is a barrier between applications, CPU, memory, and devices. Applications are what
people use all the time, with everything from video games to the Internet”
The Linux kernel is a free and open-source, monolithic, Unix-like operating system kernel. This can be
represented as such.
This is likely simplified, of course, but the advantages are as follows.
Since there is less software involved it is faster.
As it is one single piece of software it should be smaller both in source and compiled forms.
Less code generally means fewer bugs which can translate to fewer security problems.
All OS services run along with the main kernel thread, thus also residing in the same memory area.
The main disadvantages of monolithic kernels are:
The dependencies between system components — a bug in a device driver might crash the entire
system
Large kernels can become very difficult to maintain.
Most work in the monolithic kernel is done via system calls.
A system call is a way for programs to interact with the operating system. A computer program makes
a system call when it makes a request to the operating system's kernel. System calls provide the
services of the operating system to user programs via the Application Program Interface (API).
In an article on The Geek Stuff, the interaction between the computer hardware, OS kernel, system
functions, application code and library functions is illustrated.
Application code is used in one environment and can be changed to alter the behaviour. As an example
of the difference, I'll implement a sample logging mechanism: one written as application code and one
written as library code.
As such the difference between a microkernel and a monolithic kernel lies within the system calls as
well as the ‘kernel space’.
Image retrieved from Tech Difference on the 28th of September. A more detailed explanation can be
found at https://techdifferences.com/difference-between-microkernel-and-monolithic-kernel.html
The main differences were listed as the following:
1. The basic point on which microkernel and monolithic kernel are distinguished is that a microkernel
implements user services and kernel services in different address spaces, while a monolithic kernel
implements both user services and kernel services in the same address space.
2. The size of a microkernel is small, as only kernel services reside in the kernel address space.
However, a monolithic kernel is comparatively larger, because both kernel services and user
services reside in the same address space.
3. Execution of a monolithic kernel is faster, as communication between the application and the
hardware is established directly via system calls. On the other hand, execution of a microkernel is
slower, as communication between the application and the hardware is established through
message passing.
4. It is easy to extend a microkernel, because a new service is added in the user address space, which
is isolated from kernel space, so the kernel does not require modification. The opposite is the case
with a monolithic kernel: if a new service is to be added, the entire kernel needs to be modified.
5. A microkernel is more secure than a monolithic kernel: if a service fails in a microkernel, the
operating system remains unaffected. On the other hand, if a service fails in a monolithic kernel,
the entire system fails.
6. Monolithic kernel design requires less code, which further leads to fewer bugs. On the other hand,
microkernel design needs more code, which further leads to more bugs.
Software Debugging: Debugging is the process of detecting and removing existing and potential
errors (also called 'bugs') in software code that can cause it to behave unexpectedly or crash. To
prevent incorrect operation of a software system, debugging is used to find and resolve bugs or
defects. When various subsystems or modules are tightly coupled, debugging becomes harder, as any
change in one module may cause more bugs to appear in another. Sometimes it takes more time to
debug a program than to code it.
Description: To debug a program, the user has to start with a problem, isolate the source code of the
problem, and then fix it. A user of a program must know how to fix the problem, as knowledge about
problem analysis is expected. When the bug is fixed, the software is ready to use. Debugging tools
(called debuggers) are used to identify coding errors at various development stages. They are used to
reproduce the conditions in which the error occurred, then examine the program state at that time and
locate the cause. Programmers can trace the program execution step by step by evaluating the values
of variables, and stop the execution wherever required to inspect or reset program variables. Some
programming language packages provide a debugger for checking the code for errors while it is
running.
Here’s the debugging process:
1. Describe the bug. Try to get as much input from the user to get the exact reason.
2. Capture the program snapshot when the bug appears. Try to get all the variable values and states
of the program at that time.
3. Analyse the snapshot based on the state and action. Based on that, try to find the cause of the bug.
4. Fix the existing bug, but also check that no new bug is introduced.
UNIT II
DTrace
DTrace is a comprehensive dynamic tracing framework originally created by Sun
Microsystems for troubleshooting kernel and application problems on production systems in real time.
Originally developed for Solaris, it has since been released under the free Common Development and
Distribution License (CDDL) in OpenSolaris and its descendant illumos, and has been ported to
several other Unix-like systems.
DTrace can be used to get a global overview of a running system, such as the amount of memory,
CPU time, filesystem and network resources used by the active processes. It can also provide much
more fine-grained information, such as a log of the arguments with which a specific function is being
called, or a list of the processes accessing a specific file.
DTrace scripts can be invoked directly from the command line, providing one or more probes and
actions as arguments. Some examples:
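Typical one-liners look like the following (illustrative examples; the available probes and providers vary by platform):

```
# Count system calls by program name:
dtrace -n 'syscall:::entry { @num[execname] = count(); }'

# Trace new processes with their arguments:
dtrace -n 'proc:::exec-success { trace(curpsinfo->pr_psargs); }'

# Show which process opens which file:
dtrace -n 'syscall::open*:entry { printf("%s %s", execname, copyinstr(arg0)); }'
```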
Scripts can also be written which can reach hundreds of lines in length, although typically only tens of
lines are needed for advanced troubleshooting and analysis.
DTrace Programming
When you use the dtrace command, you invoke the compiler for the D language. Once DTrace has
compiled your program, it sends it to the operating system kernel for execution, where it activates the
probes that your program uses.
DTrace enables probes only when you are using them. No instrumented code is present for inactive
probes, so your system does not experience performance degradation when you are not using
DTrace. Once your D program exits, all of the probes it used are automatically disabled and their
instrumentation is removed, returning your system to its original state. No effective difference exists
between a system where DTrace is not active and one where the DTrace software is not installed.
DTrace implements the instrumentation for each probe dynamically on the live, running operating
system. DTrace neither quiesces nor pauses the system in any way, and it adds instrumentation code
only for the probes that you enable. As a result, the effect of using DTrace probes is limited to exactly
what you ask DTrace to do. DTrace instrumentation is designed to be as efficient as possible, and
enables you to use it in production to solve real problems in real time.
The DTrace framework provides support for an arbitrary number of virtual clients. You can run as
many simultaneous D programs as you like, limited only by your system's memory capacity, and all
the programs operate independently using the same underlying instrumentation. This same capability
also permits any number of distinct users on the system to take advantage of DTrace simultaneously
on the same system without interfering with one another.
Unlike a C or C++ program, but similar to a Java program, DTrace compiles your D program into a
safe intermediate form that it executes when a probe fires. DTrace validates whether this intermediate
form can run safely, reporting any run-time errors that might occur during the execution of your D
program, such as dividing by zero or dereferencing invalid memory. As a result, you cannot construct
an unsafe D program. You can use DTrace in a production environment without worrying about
crashing or corrupting your system. If you make a programming mistake, DTrace disables the
instrumentation and reports the error to you.
This figure illustrates the different components of the DTrace architecture, including probe providers,
the DTrace driver, the DTrace library, and the dtrace command.
Users interact with DTrace through the dtrace command, which serves as a front-end to the DTrace
engine. D scripts are compiled to an intermediate format (DIF) in user-space and sent to the DTrace
kernel component for execution, sometimes called the DIF Virtual Machine. This runs in the
dtrace.sys driver.
Traceext.sys (trace extension) is a Windows kernel extension driver, which allows Windows to expose
functionality that DTrace relies on to provide tracing. The Windows kernel provides callouts during
stackwalk or memory accesses which are then implemented by the trace extension.
1. Check that you are running a supported version of Windows. The current download of DTrace
is supported in the Insider builds of 20H1 Windows after version 18980 and Windows Server
Insider Preview Build 18975. Installing this version of DTrace on older versions of Windows can
lead to system instability and is not recommended.
The archived version of DTrace for 19H1 is available at Archived Download DTrace on
Windows. Note that this version of DTrace is no longer supported.
2. Download the MSI installation file (Download DTrace on Windows) from the Microsoft
Download Center.
3. Select the Complete install.
Important
Before using bcdedit to change boot information, you may need to temporarily suspend
Windows security features such as Patchguard, BitLocker and Secure Boot on the test PC. Re-enable
these security features when testing is complete, and manage the test PC appropriately while the
security features are disabled.
Internals
A simple illustration of how DTrace internals work.
root@openindiana:/home/sergey# mdb -k
> getpid::dis
getpid: pushq %rbp
getpid+1: movq %rsp,%rbp
getpid+4: subq $0x10,%rsp
getpid+8: movq %gs:0x18,%rax
getpid+0x11: movq 0x190(%rax),%r9
getpid+0x18: movq 0xb0(%r9),%r8
getpid+0x1f: movl 0x4(%r8),%eax
getpid+0x23: movl %eax,-0x8(%rbp)
getpid+0x26: testl $0x400,0xcc(%r9)
getpid+0x31: jne +0x9 <getpid+0x3c>
getpid+0x33: movl 0x34(%r9),%eax
getpid+0x37: movl %eax,-0x4(%rbp)
getpid+0x3a: jmp +0x2c <getpid+0x68>
getpid+0x3c: movq %gs:0x18,%rax
getpid+0x45: movq 0x190(%rax),%r8
getpid+0x4c: movq 0x620(%r8),%r8
getpid+0x53: movq 0x150(%r8),%r8
getpid+0x5a: movq 0xb0(%r8),%r8
getpid+0x61: movl 0x4(%r8),%eax
getpid+0x65: movl %eax,-0x4(%rbp)
getpid+0x68: movq -0x8(%rbp),%rax
getpid+0x6c: leave
getpid+0x6d: ret
Linux distributions have been around for decades now, for both servers (business-oriented
applications) and home applications. While for specialized IT applications, different kinds of hardware,
or data science Linux may be the system of choice, for home applications or general business Linux
has never seemed to be a viable option.
There are three main reasons for this, at least from my point of view, as a regular home user.
The first reason, which applies especially to older versions, is that Linux was not ready to be used by
non-tech geeks. Choosing the right distribution meant spending a lot of time on the internet checking
the differences between everything that existed, looking at the main features and trying to figure out
which distribution would be the easiest for you to use. With even older versions of Linux, internet
search was not available, so the only information you got was from specialized IT magazines, which
did not always include the data you were looking for. When you managed to find what you thought
was the right distribution, you needed to install it on your PC. This meant that you needed to look for
guides to explain each step, as most distributions had non-graphical and hard-to-use installers.
The second reason is related to available software. Even if you had put in the time to install a Linux
distribution on your PC, you were not able to install regular Windows software on it, which meant that
most commonly used software wasn't available on Linux. Many of these applications had open source
versions which could be used on Linux, but you would have to deal with compatibility issues, and with
not being able to exchange data between Linux and Windows users easily.
The third reason was that, even if you found the right software and didn't need to worry about
compatibility, you first needed to install that software. This installation process was, in most cases,
painful. In most Linux distributions (e.g. Red Hat or, later, Fedora) you were required to download the
sources of the software you wanted to use, and run a set of commands in the shell, in order to get that
software compiled and installed.
The above steps might have been doable and relatively easy for someone with an interest in IT, but for
most home users, who just want a usable out-of-the-box system and a one-click way of installing their
needed applications, this was a major deterrent.
Linking and loading are utility programs that play an important role in the execution of a program.
Linking takes the object code generated by the assembler and combines it to generate the executable
module. The loader then loads this executable module into main memory for execution.
Loading:
Bringing the program from secondary memory to main memory is called Loading.
Linking:
Establishing the linking between all the modules or all the functions of the program in order to
continue the program execution is called linking.
Differences between Linking and Loading:
1. The key difference between linking and loading is that linking generates the executable file of a
program, whereas loading loads the executable file obtained from linking into main memory for
execution.
2. Linking takes as input the object modules of a program generated by the assembler, whereas
loading takes as input the executable module generated by linking.
3. Linking combines all object modules of a program to generate the executable module; it also links
the library functions in the object modules to the built-in libraries of the high-level programming
language. On the other hand, loading allocates space to the executable module in main memory.
Static vs dynamic loading and linking:
1. Loading the entire program into main memory before the start of program execution is called
static loading. Loading the program into main memory on demand is called dynamic loading.
2. Static loading uses memory inefficiently, because the entire program is brought into main
memory whether it is required or not. Dynamic loading uses memory efficiently.
3. If static loading is used, then static linking is applied accordingly; if dynamic loading is
used, then dynamic linking is applied.
4. Static linking is performed by programs called linkers (also called link editors) as the last
step in compiling a program. With dynamic linking, individual shared modules can be updated and
recompiled independently; this is one of the greatest advantages dynamic linking offers.
5. With static linking, if any of the external modules has changed, the program has to be
recompiled and re-linked, or else the changes will not be reflected in the existing executable
file. With dynamic linking, load time may be reduced if the shared library code is already
present in memory.
Executable and linking format (ELF)
ELF is the standard binary format on operating systems such as Linux. Some of the capabilities of
ELF are dynamic linking, dynamic loading, imposing run-time control on a program, and an improved
method for creating shared libraries. The ELF representation of control data in an object file is
platform independent, which is an additional improvement over previous binary formats.
The ELF representation permits object files to be identified, parsed, and interpreted similarly,
making ELF object files compatible across multiple platforms and architectures of different sizes.
The three main types of ELF files are:
Executable
Relocatable
Shared object
These file types hold the code, data, and information about the program that the operating system
and linkage editor need to perform the appropriate actions on these files.
Static Linking:
When we click the .exe (executable) file of the program and it starts running, all the necessary
contents of the binary file have been loaded into the process’s virtual address space. However, most
programs also need to run functions from the system libraries, and these library functions also need to
be loaded.
In the simplest case, the necessary library functions are embedded directly in the program’s
executable binary file. Such a program is statically linked to its libraries, and statically linked
executable codes can commence running as soon as they are loaded.
Disadvantage:
Every program generated must contain copies of exactly the same common system library functions.
In terms of both physical memory and disk-space usage, it is much more efficient to load the system
libraries into memory only once. Dynamic linking allows this single loading to happen.
Dynamic Linking:
Every dynamically linked program contains a small, statically linked function that is called when the
program starts. This static function only maps the link library into memory and runs the code that the
function contains. The link library determines which dynamic libraries the program requires, along
with the names of the variables and functions needed from those libraries, by reading the information
contained in sections of the library.
After which it maps the libraries into the middle of virtual memory and resolves the references to the
symbols contained in those libraries. We don’t know where in the memory these shared libraries are
actually mapped: They are compiled into position-independent code (PIC), that can run at any
address in memory.
Advantage:
Memory requirements of the program are reduced. A DLL is loaded into memory only once, while
more than one application may use the same DLL at the same time, saving memory space.
Application support and maintenance costs are also lowered.
Example implementation
The following example uses x86 assembly language to implement a spinlock. It will work on
any Intel 80386 compatible processor.
; Intel syntax
spin_lock:
    mov     eax, 1          ; Set the EAX register to 1.
    xchg    eax, [locked]   ; Atomically swap the EAX register with
                            ;  the lock variable.
                            ; This will always store 1 to the lock, leaving
                            ;  the previous value in the EAX register.
    test    eax, eax        ; Test EAX with itself. Among other things,
                            ;  this will set the processor's Zero Flag if EAX is 0.
                            ; If EAX is 0, then the lock was unlocked and
                            ;  we just locked it.
                            ; Otherwise, EAX is 1 and we didn't acquire the lock.
    jnz     spin_lock       ; Jump back to the MOV instruction if the Zero Flag
                            ;  is not set; the lock was previously locked, and so
                            ;  we need to spin until it becomes unlocked.
    ret                     ; The lock has been acquired, return to the
                            ;  calling function.

spin_unlock:
    xor     eax, eax        ; Set the EAX register to 0.
    xchg    eax, [locked]   ; Atomically swap the EAX register with
                            ;  the lock variable.
    ret                     ; The lock has been released.
The locking mechanisms used by the kernel are also implemented for user-level threads, so the same
locks are available both inside and outside the kernel. The only difference is that priority
inheritance is used only inside the kernel; user-level threads do not provide this functionality.
To optimize Solaris performance, developers refine the locking methods: because locks are used
frequently, and typically for crucial kernel functions, tuning their implementation and use can
yield great performance gains.
Contrary to popular belief, the Windows operating system is not any worse in regard to performance
than its contemporaries like macOS and Linux. In this article, we will look into various ways you
can optimize the performance of your Windows 10 machine.
Here are 13 ways you can optimize the performance of your Windows 10 computer.
1. Upgrading Windows
Windows updates contain new features, better security, bug fixes, and performance
enhancements. To update Windows, follow these steps –
1. Open Settings.
2. Click on Update & Security.
3. Click on Windows Update.
4. Click the Check for updates button and download the updates.
1. Preemptive Kernel :
A preemptive kernel, as the name suggests, is a type of kernel that always executes the highest-priority
task that is ready to run. It cannot use non-reentrant functions unless those functions are mutually
exclusive.
Example : Linux 2.6
2. Non-Preemptive Kernel :
A non-preemptive kernel, as the name suggests, is a type of kernel that is free from race conditions on
kernel data structures, as only one process is active in the kernel at a time. This is considered a
serious drawback for real-time applications, as it does not allow preemption of a process running in
kernel mode.
Example : Linux 2.4
Preemptive kernel vs non-preemptive kernel:
1. In a preemptive kernel, a running process might be replaced immediately; in a non-preemptive
kernel, a process continues to run until it finishes its execution handler or voluntarily
relinquishes the CPU.
2. A preemptive kernel is more suitable for real-time programming; a non-preemptive kernel is
less suitable.
3. In a preemptive kernel, the higher-priority task that is ready to run is given CPU control;
in a non-preemptive kernel, each task must explicitly give up CPU control.
4. A preemptive kernel generally allows preemption even in kernel mode; a non-preemptive kernel
generally does not allow preemption of a process running in kernel mode.
5. In a preemptive kernel, response time is deterministic and the system is more responsive; in a
non-preemptive kernel, response time is nondeterministic and the system is less responsive.
6. In a preemptive kernel, when a higher-priority task becomes ready, the currently running task
is suspended and moved to the ready queue; in a non-preemptive kernel, a higher-priority task
might have to wait a long time.
7. Preemptive kernels are more secure and more useful in real-world scenarios; non-preemptive
kernels are less so.
Effects of modern Memory hierarchies and related optimizations
In computer architecture, the memory hierarchy separates computer storage into a hierarchy based
on response time. Since response time, complexity, and capacity are related, the levels may also be
distinguished by their performance and controlling technologies. Memory hierarchy affects
performance in computer architectural design, algorithm predictions, and lower-level
programming constructs involving locality of reference.
Designing for high performance requires considering the restrictions of the memory hierarchy, i.e. the
size and capabilities of each component. Each of the various components can be viewed as part of a
hierarchy of memories (m1, m2, ..., mn) in which each member mi is typically smaller and faster than
the next-highest member mi+1 of the hierarchy. To limit waiting by higher levels, a lower level will
respond by filling a buffer and then signaling to activate the transfer.
There are four major storage levels: internal (processor registers and cache), main memory (system
RAM), on-line mass storage (secondary storage), and off-line bulk storage (tertiary storage such as
tape).
1. Process:
Process is an activity of executing a program. Process is of two types – User process and System
process. Process control block controls the operation of the process.
2. Kernel Thread:
Kernel thread is a type of thread in which threads of a process are managed at kernel level. Kernel
threads are scheduled by operating system (kernel mode).
Process vs kernel thread:
1. A process is a program being executed; a kernel thread is a thread managed at kernel level.
2. Suspension of a process does not affect other processes; suspension of a kernel thread leads
to all of its threads stopping.
3. The types of processes are user processes and system processes; the types of kernel threads
are kernel-level single thread and kernel-level multi-thread.
Process Table and Process Control Block (PCB)
While creating a process the operating system performs several operations. To identify the
processes, it assigns a process identification number (PID) to each process. As the operating system
supports multi-programming, it needs to keep track of all the processes. For this task, the process
control block (PCB) is used to track the process’s execution status. Each block of memory contains
information about the process state, program counter, stack pointer, status of opened files, scheduling
algorithms, etc. All this information is required and must be saved when the process is switched
from one state to another; when such a transition occurs, the operating system must update the
information in the process's PCB.
A process control block (PCB) contains information about the process, i.e. registers, quantum, priority,
etc. The process table is an array of PCBs; logically, it contains a PCB for each of the current
processes in the system.
Pointer – It is a stack pointer which is required to be saved when the process is switched from
one state to another to retain the current position of the process.
Process state – It stores the respective state of the process.
Process number – Every process is assigned with a unique id known as process ID or PID which
stores the process identifier.
Program counter – It stores the counter which contains the address of the next instruction that is
to be executed for the process.
Register – These are the CPU registers, which include the accumulator, base and index registers,
and general-purpose registers.
Memory limits – This field contains the information about memory management system used by
operating system. This may include the page tables, segment tables etc.
Open files list – This information includes the list of files opened for a process.
Miscellaneous accounting and status data – This field includes information about the amount of
CPU used, time constraints, jobs or process number, etc.
The process control block stores the register content also known as execution content of the
processor when it was blocked from running. This execution content architecture enables the
operating system to restore a process’s execution context when the process returns to the running
state. When the process makes a transition from one state to another, the operating system updates
its information in the process’s PCB. The operating system maintains pointers to each process’s PCB
in a process table so that it can access the PCB quickly.
Lookup
You can configure the stage to complete a lookup operation on the database for each input record
(referred to as the key record) and return rows that match the criteria that are specified by that record.
The lookup operation is performed by running a parameterized SELECT statement which contains
a WHERE clause with parameters that are associated with the columns marked as Key columns in
the records that represent key records for the lookup.
On server canvas the job configuration for the database lookup requires that the Transformer stage is
used in combination with the database stage. The Transformer stage has an input link on which the
key records arrive to be used as input for the lookup query. It also has one or more reference links
coming from the database stage. The database stage is provided with input key records on this link.
For each input key record, the database stage runs the parameterized SELECT statement with the
key record values used in the WHERE clause and provides the corresponding matching records to
the Transformer stage. Those records are then processed and routed by the Transformer stage to
one or more of its output links to be further processed by the downstream stages in the job.
In some cases the SELECT lookup statement may return multiple record matches. The user can
specify whether the stage should log a message when this happens.
Memory allocation:
To gain proper memory utilization, memory must be allocated in an efficient manner. One of the
simplest methods for allocating memory is to divide memory into several fixed-sized partitions,
where each partition contains exactly one process. Thus, the degree of multiprogramming is
determined by the number of partitions.
Fixed partition allocation: In this method, a process is selected from the input queue and loaded
into a free partition. When the process terminates, the partition becomes available for other
processes.
Variable partition allocation: In this method, the operating system maintains a table that indicates
which parts of memory are available and which are occupied by processes. Initially, all memory is
available for user processes and is considered one large block of available memory, known as a
"hole". When a process arrives and needs memory, we search for a hole that is large enough to store
the process. If one is found, we allocate only as much memory as is needed, keeping the rest
available to satisfy future requests. Allocating memory this way raises the dynamic storage
allocation problem, which concerns how to satisfy a request of size n from a list of free holes.
There are several solutions to this problem:
First fit:-
In first fit, the first available free hole that fulfills the requirement of the process is allocated.
Here, in this diagram, the 40 KB memory block is the first available free hole that can store
process A (size 25 KB), because the first two blocks do not have sufficient memory space.
Best fit:-
In best fit, we allocate the smallest hole that is big enough for the process's requirements. For
this, we search the entire list, unless the list is ordered by size.
Here, in this example, we first traverse the complete list and find that the last hole, 25 KB, is
the best suitable hole for process A (size 25 KB).
In this method memory utilization is maximum as compared to other memory allocation techniques.
Worst fit:-
In worst fit, we allocate the largest available hole to the process. This method produces the
largest leftover hole.
Here in this example, Process A (Size 25 KB) is allocated to the largest available memory block
which is 60KB. Inefficient memory utilization is a major issue in the worst fit.
The proc file system (procfs) is a virtual file system created on the fly when the system boots and
dissolved at system shutdown.
It contains useful information about the processes that are currently running and is regarded as the
control and information center for the kernel.
The proc file system also provides a communication medium between kernel space and user space.
Below is a snapshot of /proc from my PC.
ls -l /proc
total 0
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1
dr-xr-xr-x 9 root root 0 Mar 31 21:34 10
dr-xr-xr-x 9 avahi avahi 0 Mar 31 21:34 1034
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1036
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1039
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1041
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1043
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1044
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1048
dr-xr-xr-x 9 root root 0 Mar 31 21:34 105
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1078
dr-xr-xr-x 9 root root 0 Mar 31 21:34 11
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1121
dr-xr-xr-x 9 lp lp 0 Mar 31 21:34 1146
dr-xr-xr-x 9 postgres postgres 0 Mar 31 21:34 1149
dr-xr-xr-x 9 mysql mysql 0 Mar 31 21:34 1169
dr-xr-xr-x 9 postgres postgres 0 Mar 31 21:34 1180
dr-xr-xr-x 9 postgres postgres 0 Mar 31 21:34 1181
dr-xr-xr-x 9 postgres postgres 0 Mar 31 21:34 1182
dr-xr-xr-x 9 postgres postgres 0 Mar 31 21:34 1183
dr-xr-xr-x 9 postgres postgres 0 Mar 31 21:34 1184
dr-xr-xr-x 9 root root 0 Mar 31 21:34 1186
dr-xr-xr-x 9 root root 0 Mar 31 21:34 12
...
If you list the directories, you will find that for each PID of a running process there is a
dedicated directory.
You can check directories only on terminal using
ls -l /proc | grep '^d'
Now let’s check a particular process by its PID; you can get the PID of any running process
from the ps command
ps -aux
Output:
Now check the highlighted process with PID=7494; you can see that there is an entry
for this process in the /proc file system.
ls -ltr /proc/7494
Output:
total 0
-rw-r--r-- 1 mandeep mandeep 0 Apr 1 01:14 oom_score_adj
dr-xr-xr-x 13 mandeep mandeep 0 Apr 1 01:14 task
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:16 status
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:16 stat
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:16 cmdline
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:17 wchan
-rw-r--r-- 1 mandeep mandeep 0 Apr 1 01:17 uid_map
-rw-rw-rw- 1 mandeep mandeep 0 Apr 1 01:17 timerslack_ns
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:17 timers
-r-------- 1 mandeep mandeep 0 Apr 1 01:17 syscall
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:17 statm
-r-------- 1 mandeep mandeep 0 Apr 1 01:17 stack
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:17 smaps
-rw-r--r-- 1 mandeep mandeep 0 Apr 1 01:17 setgroups
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:17 sessionid
-r--r--r-- 1 mandeep mandeep 0 Apr 1 01:17 schedstat
-rw-r--r-- 1 mandeep mandeep 0 Apr 1 01:17 sched
lrwxrwxrwx 1 mandeep mandeep 0 Apr 1 01:17 root ->
/proc/2341/fdinfo
-rw-r--r-- 1 mandeep mandeep 0 Apr 1 01:17 projid_map
-r-------- 1 mandeep mandeep 0 Apr 1 01:17 personality
...
In Linux, /proc includes a directory for each running process, including kernel
processes, in directories named /proc/PID. These are some of the entries present:
directory description
/proc/PID/cmdline Command line arguments.
/proc/PID/cpu Current and last cpu in which it was executed.
/proc/PID/cwd Link to the current working directory.
/proc/PID/environ Values of environment variables.
/proc/PID/exe Link to the executable of this process.
/proc/PID/fd Directory, which contains all file descriptors.
/proc/PID/maps Memory maps to executables and library files.
/proc/PID/mem Memory held by this process.
/proc/PID/root Link to the root directory of this process.
/proc/PID/stat Process status.
/proc/PID/statm Process memory status information.
/proc/PID/status Process status in human readable form.
For example, /proc/crypto lists the ciphers registered with the kernel's crypto API:
name : ccm(aes)
driver : ccm_base(ctr(aes-aesni), cbcmac(aes-aesni))
module : ccm
priority : 300
refcnt : 2
selftest : passed
internal : no
type : aead
async : no
blocksize : 1
ivsize : 16
maxauthsize : 16
geniv :
name : ctr(aes)
driver : ctr(aes-aesni)
module : kernel
priority : 300
refcnt : 3
selftest : passed
internal : no
type : blkcipher
blocksize : 1
min keysize : 16
max keysize : 32
ivsize : 16
geniv : chainiv
Optimizations
Optimization and observability go hand in hand in the sense that optimizing performance first requires
that you have visibility. When a system is observable, you’re able to know the current state/behavior
of the system and where performance bottlenecks exist. If a team lacks this insight into their system,
they will resort to guessing, so observability plays a key role in managing and optimizing
performance.
Virtual file system and the layering of a file system call from API to
driver
A virtual file system (VFS) is programming that forms an interface between an operating
system's kernel and a more concrete file system.
The VFS serves as an abstraction layer that gives applications access to different types of file
systems and local and network storage devices. For that reason, a VFS may also be known as
a virtual file system switch. It also manages the data storage and retrieval between the operating
system and the storage sub-system. The VFS maintains a cache of directory lookups to enable easy
location of frequently accessed directories.
Sun Microsystems introduced one of the first VFSes on Unix-like systems. The VMware Virtual
Machine File System (VMFS), NTFS, Linux's Global File System (GFS) and the Oracle Clustered File
System (OCFS) are all examples of virtual file systems.
API Layering requires that binaries in Windows Driver packages call only those APIs and DDIs that
are included in UWP-based editions of Windows 10 or are from a curated set of Win32 APIs. API
Layering is an extension of the previous "U" requirement that was a part of DCHU design principles.
To see which platform an API supports, visit the documentation page for the API and examine
the Target Platform entry of the Requirements section. Windows Drivers must only use APIs or DDIs
that have a Target Platform listed as Universal, meaning the subset of functionality that is available
on all Windows offerings.
The Windows API Sets page describes a set of best practices and tools for determining whether an
API is available on a particular platform.
Api Validator is the main tool used to validate API Layering compliance for Windows Drivers. Api
Validator ships as part of the Windows Driver Kit (WDK).
See Validating Windows Drivers for more details on using Api Validator to verify that a Windows
Driver meets the API Layering requirement.
Instead of looking to the language to provide guidance, a software engineer must look to established
practice to find out what works well and what is best avoided. Interpreting established practice is not
always as easy as one might like and the effort, once made, is worth preserving. To preserve that
effort on your author's part, this article brings another installment in an occasional series on Linux
Kernel Design Patterns and attempts to set out - with examples - the design patterns in the Linux
Kernel which effect an object-oriented style of programming.
Rather than providing a brief introduction to the object-oriented style, tempting though that is, we will
assume the reader has a basic knowledge of objects, classes, methods, inheritance, and similar
terms. For those as yet unfamiliar with these, there are plenty of resources to be found elsewhere on
the web.
Method Dispatch
The large variety of styles of inheritance and rules for its usage in languages today seems to suggest
that there is no uniform understanding of what "object-oriented" really means. The term is a bit like
"love": everyone thinks they know what it means but when you get down to details people can find
they have very different ideas. While what it means to be "oriented" might not be clear, what we mean
by an "object" does seem to be uniformly agreed upon. It is simply an abstraction comprising both
state and behavior. An object is like a record (Pascal) or struct (C), except that some of the names of
members refer to functions which act on the other fields in the object. These function members are
sometimes referred to as "methods".
The first observation is that some function pointers in some vtables are allowed to be NULL. Clearly
trying to call such a function would be futile, so the code that calls into these methods generally
contains an explicit test for the pointer being NULL. There are a few different reasons for these NULL
pointers. Probably easiest to justify is the incremental development reason. Because of the way
vtable structures are initialized, adding a new function pointer to the structure definition causes all
existing table declarations to initialise that pointer to NULL. Thus it is possible to add a caller of the
new method before any instance supports that method, and have it check for NULL and perform a
default behavior. Then as incremental development continues those vtable instances which need it
can get non-default methods.
Generics is the idea of allowing types (Integer, String, etc., as well as user-defined types) to be
parameters to methods, classes, and interfaces. For example, classes like array and map can be
implemented very efficiently using generics, and can then be used with any type.
The method of Generic Programming is implemented to increase the efficiency of the code. Generic
Programming enables the programmer to write a general algorithm which will work with all data types.
It eliminates the need to create different algorithms if the data type is an integer, string or a character.
The advantages of Generic Programming are
1. Code Reusability
2. Avoid Function Overloading
3. Once written, code can be used multiple times and in many cases.
Generics can be implemented in C++ using Templates. Template is a simple and yet very powerful
tool in C++. The simple idea is to pass data type as a parameter so that we don’t need to write the
same code for different data types. For example, a software company may need sort() for different
data types. Rather than writing and maintaining the multiple codes, we can write one sort() and pass
data type as a parameter.
We write a generic function that can be used for different data types. Examples of function templates
are sort(), max(), min(), printArray()
#include <iostream>
using namespace std;

// A function template: T is a type parameter.
template <typename T>
T myMax(T x, T y)
{
    return (x > y) ? x : y;
}

int main()
{
    cout << myMax<int>(3, 7) << endl;        // instantiates myMax<int>
    cout << myMax<double>(3.0, 7.0) << endl; // instantiates myMax<double>
    cout << myMax<char>('g', 'e') << endl;   // instantiates myMax<char>
    return 0;
}
Output:
7
7
g
An address space is the range of memory addresses available to a computational entity -- for
example, a device, a file, a server, or a networked computer. The system provides each device and
process with an address space that holds a specific portion of the processor's address space.
Tying top-down and bottom-up object and memory page lookups
with the actual x86 page translation and segmentation
Paging:
Paging is a technique used for non-contiguous memory allocation. It is a fixed-size partitioning
scheme: both main memory and secondary memory are divided into equal fixed-size partitions. The
partitions of secondary memory are called pages, and the partitions of main memory are called
frames.
Paging is a memory management method used to fetch processes from secondary memory into main
memory in the form of pages. In paging, each process is split into parts, where the size of each
part is the same as the page size; the size of the last part may be less than the page size. The
pages of a process are stored in the frames of main memory depending on their availability.
Segmentation:
Segmentation is another non-contiguous memory allocation scheme. Unlike paging, in segmentation a
process is not divided arbitrarily into fixed-size pages; it is a variable-size partitioning
scheme. Unlike paging, in segmentation secondary and main memory are not divided into partitions
of equal size. The partitions of secondary memory are called segments. The details concerning
each segment are held in a table called the segment table.
The segment table contains two main pieces of data about a segment: the Base, which is the base
address of the segment, and the Limit, which is the length of the segment.
In segmentation, the CPU generates a logical address that contains a segment number and a segment
offset. If the segment offset is less than the limit, the address is valid; otherwise a fault is
raised because the address is invalid.
The above figure shows the translation of a logical address to a physical address.
There are many file operations that a computer system can perform.
Let us now briefly describe the most common operations that can be performed on files.
A file has to be deleted when it is no longer needed, to free up disk space.
A file must be closed to free up internal table space when all accesses are finished and the
attributes and disk addresses are no longer needed.
The file read operation is performed to read the data stored in the required file.
The file write operation is used to write data to the file, again, generally at the current position.
The file append operation is the same as the write operation, except that append can only add data
at the end of the file.
For random-access files, a method is needed to specify from where to take the data; the file seek
operation performs this task.
The file get-attributes operation is performed by processes when they need to read the file's
attributes to do their required work.
The file set-attributes operation is used to set some of the attributes (the user-settable
attributes) after the file has been created.
The file rename operation is used to change the name of an existing file.
I/O buffering
A buffer is a memory area that stores data being transferred between two devices or between a
device and an application.
1. Single buffer :
A buffer is provided by the operating system to the system portion of the main memory.
Block oriented device –
System buffer takes the input.
After taking the input, the block gets transferred to the user space by the process and then the
process requests for another block.
Two blocks works simultaneously, when one block of data is processed by the user process, the
next block is being read in.
OS can swap the processes.
OS can record the data of system buffer to user processes.
Stream oriented device –
Line-at-a-time operation is used for scroll-mode terminals: the user inputs one line at a time,
with a carriage return signaling the end of the line.
Byte-at-a-time operation is used on forms-mode terminals, where each keystroke is significant.
2. Double buffer :
Block oriented –
There are two buffers in the system.
One buffer is used by the driver or controller to store data while waiting for it to be taken by higher
level of the hierarchy.
Other buffer is used to store data from the lower level module.
Double buffering is also known as buffer swapping.
A major disadvantage of double buffering is that it increases the complexity of the process.
If the process performs rapid bursts of I/O, double buffering may be insufficient.
Stream oriented –
With line-at-a-time I/O, the user process need not be suspended for input or output unless the
process runs ahead of the double buffer.
For byte-at-a-time operations, the double buffer offers no advantage over a single buffer of
twice the length.
3. Circular buffer :
When more than two buffers are used, the collection of buffers is itself referred to as a circular
buffer.
In this scheme, data are not passed directly from the producer to the consumer, because the data
could be overwritten in a buffer before it had been consumed.
The producer can only fill up to buffer i-1 while the data in buffer i is waiting to be consumed.
The purpose of swapping is to access data present on the hard disk and bring it into RAM so that
application programs can use it. The thing to remember is that swapping is used only when the data
is not present in RAM.
Although the process of swapping affects the performance of the system, it helps to run larger
processes, and more than one process. This is the reason why swapping is also referred to as
memory compaction.
The concept of swapping is divided into two further concepts: swap-in and swap-out.
o Swap-out is the method of removing a process from RAM and writing it to the hard disk.
o Swap-in is the method of bringing a process back from the hard disk into main
memory (RAM).
Example: Suppose the user process's size is 2048 KB and the standard hard disk used for swapping
has a data transfer rate of 1 Mbps. Now we will calculate how long it will take to transfer the process
from main memory to secondary memory.
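The arithmetic can be sketched as follows; reading the quoted "1 Mbps" as 1024 KB per second is an assumption, since the example leaves the units loose.

```c
#include <assert.h>

/* Transfer time in milliseconds for size_kb kilobytes at a rate of
 * rate_kb_per_s kilobytes per second (here we assume the example's
 * "1 Mbps" means 1024 KB/s). */
long transfer_ms(long size_kb, long rate_kb_per_s) {
    return size_kb * 1000L / rate_kb_per_s;
}
```

Under that assumption, transfer_ms(2048, 1024) gives 2000 ms per direction, so a full swap-out followed by a swap-in of the same process takes about 4 seconds.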
Advantages of Swapping
1. It helps the CPU to manage multiple processes within a single main memory.
2. Swapping allows the CPU to perform multiple tasks simultaneously. Therefore, processes do
not have to wait very long before they are executed.
Disadvantages of Swapping
1. If the computer system loses power, the user may lose all information related to the program in
case of substantial swapping activity.
2. If the swapping algorithm is not good, the method can increase the number of page faults
and decrease the overall processing performance.
Note:
o In a single tasking operating system, only one process occupies the user program area of
memory and stays in memory until the process is complete.
o In a multitasking operating system, a situation arises when all the active processes cannot fit
in main memory; then a process is swapped out of main memory so that other processes can
enter it.
The Solaris kernel memory (kmem) allocator provides a powerful set of debugging features that can
facilitate analysis of a kernel crash dump. This chapter discusses these debugging features, and the
MDB dcmds and walkers designed specifically for the allocator. Bonwick provides an overview of the
principles of the allocator itself. Refer to the header file <sys/kmem_impl.h> for the definitions of
allocator data structures. The kmem debugging features can be enabled on a production system to
enhance problem analysis, or on development systems to aid in debugging kernel software and device
drivers.
Note –
This guide reflects Solaris 9 implementation; this information might not be relevant, correct, or
applicable to past or future releases, since it reflects the current kernel implementation. It does not
define a public interface of any kind. All of the information provided about the kernel memory allocator
is subject to change in future Solaris releases.
Vmem Allocator
The kmem allocator relies on two lower-level system services to create slabs: a virtual address
allocator to provide kernel virtual addresses, and VM routines to back those addresses with physical
pages and establish virtual-to-physical translations. The scalability of large systems was limited by the
old virtual address allocator (the resource map allocator). It tended to fragment the address space
badly over time, its latency was linear in the number of fragments, and the whole thing was single-
threaded.
Virtual address allocation is, however, just one example of the more general problem of resource
allocation. For our purposes, a resource is anything that can be described by a set of integers. For
example: virtual addresses are subsets of the 64-bit integers; process IDs are subsets of the integers
[0, 30000]; and minor device numbers are subsets of the 32-bit integers.
In this section we describe the new general-purpose resource allocator, vmem, which provides
guaranteed constant-time performance with low fragmentation. Vmem appears to be the first resource
allocator that can do this.
We begin by providing background on the current state of the art. We then lay out the objectives of
vmem, describe the vmem interfaces, explain the implementation in detail, and discuss vmem's
performance (fragmentation, latency, and scalability) under both benchmarks and real-world
conditions.
To gain proper memory utilization, memory must be allocated in an efficient manner. One of the
simplest methods for allocating memory is to divide memory into several fixed-sized partitions and
each partition contains exactly one process. Thus, the degree of multiprogramming is obtained by
the number of partitions.
Fixed partition allocation: In this method, a process is selected from the input queue and loaded
into a free partition. When the process terminates, the partition becomes available for other
processes.
Variable partition allocation: In this method, the operating system maintains a table that indicates
which parts of memory are available and which are occupied by processes. Initially, all memory is
available for user processes and is considered one large block of available memory. This available
memory is known as a “hole”. When a process arrives and needs memory, we search for a hole that
is large enough to store the process. If one is found, we allocate the needed memory to the process,
keeping the rest available to satisfy future requests. While allocating memory in this way, the
dynamic storage allocation problem arises: how to satisfy a request of size n from a
list of free holes. There are several solutions to this problem:
First fit:-
In first fit, the first free hole that is large enough is allocated to the process.
In the example (diagram omitted), the 40 KB memory block is the first available free hole that can
store process A (size 25 KB), because the first two blocks did not have sufficient memory space.
Best fit:-
In best fit, we allocate the smallest hole that is big enough for the process. For this, we must
search the entire list, unless the list is kept ordered by size.
In the example (diagram omitted), we first traverse the complete list and find that the last hole,
25 KB, is the best-suited hole for process A (size 25 KB).
With this method, memory utilization is maximum compared to the other allocation techniques.
Worst fit:- In worst fit, we allocate the largest available hole to the process. This method produces
the largest leftover hole.
In the example (diagram omitted), process A (size 25 KB) is allocated to the largest available
memory block, which is 60 KB. Inefficient memory utilization is the major issue with worst fit.
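The three placement strategies can be sketched over a list of free hole sizes. The hole sizes used in the usage example below are assumptions chosen to match the worked example (two holes too small for the request, plus the 40 KB, 60 KB, and 25 KB holes the text mentions).

```c
#include <assert.h>

/* Each strategy returns the index of the chosen hole, or -1 if none fits. */

int first_fit(const int holes[], int n, int req) {
    for (int i = 0; i < n; i++)
        if (holes[i] >= req)
            return i;            /* first hole large enough wins */
    return -1;
}

int best_fit(const int holes[], int n, int req) {
    int best = -1;
    for (int i = 0; i < n; i++)  /* must scan the whole list */
        if (holes[i] >= req && (best < 0 || holes[i] < holes[best]))
            best = i;            /* smallest hole that still fits */
    return best;
}

int worst_fit(const int holes[], int n, int req) {
    int worst = -1;
    for (int i = 0; i < n; i++)
        if (holes[i] >= req && (worst < 0 || holes[i] > holes[worst]))
            worst = i;           /* largest hole, largest leftover */
    return worst;
}
```

For example, with free holes {10, 20, 40, 60, 25} and a 25 KB request, first fit picks the 40 KB hole (index 2), best fit the 25 KB hole (index 4), and worst fit the 60 KB hole (index 3).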
Challenges of multiple CPUs and Memory Hierarchy
In computer system design, the memory hierarchy is an organization of memory that minimizes
access time. The memory hierarchy was developed based on a program behavior known as locality
of reference. [Figure omitted: the levels of the memory hierarchy, from registers and caches through
main memory down to secondary storage.]
Integrity
An integrity checking and recovery (ICAR) system is presented here, which protects file system
integrity and automatically restores modified files. The system enables generation and verification of
cryptographic hashes of files, as well as configuration of security constraints. All of the crucial data,
including the ICAR system binaries, file backups and the hash database, are stored in physically write-
protected storage to eliminate the threat of unauthorised modification. A buffering mechanism was
designed and implemented in the system to increase operation performance. Additionally, the system
supplies user tools for cryptographic hash generation and security database management. The
system is implemented as a kernel extension, compliant with the Linux security model. Experimental
evaluation of the system was performed and showed an approximate 10% performance degradation
in secured file access compared to regular access.
Isolation
Process isolation in computer programming is the segregation of different software processes to
prevent them from accessing memory space they do not own. The concept of process isolation helps
to improve operating system security by providing different privilege levels to certain programs and
restricting the memory those programs can use. Although there are many implementations of process
isolation, it is frequently used in web browsers to separate multiple tabs and to protect the core
browser itself should a process fail. It can be hardware based or software based, but both serve the
same purpose of limiting access to system resources and keeping programs isolated to their
own virtual address space.
Mediation
In the context of operating system security, mediation refers to the principle of complete mediation:
every access by a subject (a process or user) to an object (a file, device, or other resource) must be
checked against the security policy by a trusted component, the reference monitor. If any access path
can bypass this check, the policy cannot be enforced, so the mediating mechanism must be always
invoked, tamper-proof, and small enough to be analyzed and verified.
Auditing
Auditing systems in modern operating systems collect detailed information about security-related
events. The audit or security logs generated by an auditing system facilitate identification of
attempted attacks, security policy improvement, security incident investigation, and review by
auditors.
MULTICS and MLS to modern UNIX
Multics ("Multiplexed Information and Computing Service") is an influential early time-
sharing operating system based on the concept of a single-level memory. It has been said that Multics
"has influenced all modern operating systems since, from microcomputers to mainframes."
Initial planning and development for Multics started in 1964, in Cambridge, Massachusetts. Originally
it was a cooperative project led by MIT (Project MAC with Fernando Corbató) along with General
Electric and Bell Labs. It was developed on the GE 645 computer, which was specially designed for it;
the first one was delivered to MIT in January, 1967.
Multics was conceived as a commercial product for General Electric, and became one for Honeywell,
albeit not very successfully. Due to its many novel and valuable ideas, Multics has had a significant
influence on computer science despite its faults.
Multics has numerous features intended to ensure high availability so that it would support
a computing utility similar to the telephone and electricity utilities. Modular hardware structure and
software architecture are used to achieve this. The system can grow in size by simply adding more of
the appropriate resource, be it computing power, main memory, or disk storage. Separate access
control lists on every file provide flexible information sharing, but complete privacy when needed.
Multics has a number of standard mechanisms to allow engineers to analyze the performance of the
system, as well as a number of adaptive performance optimization mechanisms.
SELinux was released to the open source community in 2000, and was integrated into the upstream
Linux kernel in 2003.
SELinux defines access controls for the applications, processes, and files on a system. It uses
security policies, which are a set of rules that tell SELinux what can or can’t be accessed, to enforce
the access allowed by a policy.
When an application or process, known as a subject, makes a request to access an object, like a file,
SELinux checks with an access vector cache (AVC), where permissions are cached for subjects and
objects.
If SELinux is unable to make a decision about access based on the cached permissions, it sends the
request to the security server. The security server checks for the security context of the app or
process and the file. Security context is applied from the SELinux policy database. Permission is then
granted or denied.
SELinux implementation
SELinux is set up to default-deny, which means that every single access for which it has a hook in the
kernel must be explicitly allowed by policy. This means a policy file comprises a large amount of
information regarding rules, types, classes, permissions, and more. A full consideration of SELinux is
out of the scope of this document, but an understanding of how to write policy rules is now essential
when bringing up new Android devices. There is a great deal of information available regarding
SELinux already. See the supporting documentation for suggested resources.
Key files
To enable SELinux, integrate the latest Android kernel and then incorporate the files found in
the system/sepolicy directory. When compiled, those files comprise the SELinux kernel security policy
and cover the upstream Android operating system.
In general, you should not modify the system/sepolicy files directly. Instead, add or edit your own
device-specific policy files in the /device/manufacturer/device-name/sepolicy directory. In Android 8.0
and higher, the changes you make to these files should only affect policy in your vendor directory. For
more details on separation of public sepolicy in Android 8.0 and higher, see Customizing SEPolicy in
Android 8.0+. Regardless of Android version, you're still modifying these files:
Policy files
Files that end with *.te are SELinux policy source files, which define domains and their labels. You
may need to create new policy files in /device/manufacturer/device-name/sepolicy, but you should try
to update existing files where possible.
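As an illustration, a device-specific *.te file might define a domain for a daemon. The names mydaemon and mydaemon_exec below are invented for the example; domain, exec_type, file_type, init_daemon_domain, and r_file_perms come from the standard AOSP policy attributes and macros.

```
# device/manufacturer/device-name/sepolicy/mydaemon.te (hypothetical)
type mydaemon, domain;                     # label for the running process
type mydaemon_exec, exec_type, file_type;  # label for the daemon's binary
init_daemon_domain(mydaemon)               # let init transition into the domain
allow mydaemon proc:file r_file_perms;     # read access to proc files
```

Under default-deny, the daemon can do nothing beyond what such allow rules explicitly grant.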
Context files
Context files are where you specify labels for your objects.
file_contexts assigns labels to files and is used by various userspace components. As you create new
policies, create or update this file to assign new labels to files. To apply new file_contexts, rebuild the
filesystem image or run restorecon on the file to be relabeled. On upgrades, changes
to file_contexts are automatically applied to the system and userdata partitions as part of the upgrade.
Changes can also be automatically applied on upgrade to other partitions by
adding restorecon_recursive calls to your init.board.rc file after the partition has been mounted read-
write.
genfs_contexts assigns labels to filesystems, such as proc or vfat that do not support extended
attributes. This configuration is loaded as part of the kernel policy but changes may not take effect for
in-core inodes, requiring a reboot or unmounting and re-mounting the filesystem to fully apply the
change. Specific labels may also be assigned to specific mounts, such as vfat using
the context=mount option.
property_contexts assigns labels to Android system properties to control what processes can set
them. This configuration is read by the init process during startup.
service_contexts assigns labels to Android binder services to control what processes can add
(register) and find (lookup) a binder reference for the service. This configuration is read by
the servicemanager process during startup.
seapp_contexts assigns labels to app processes and /data/data directories. This configuration is read
by the zygote process on each app launch and by installd during startup.
mac_permissions.xml assigns a seinfo tag to apps based on their signature and optionally their
package name. The seinfo tag can then be used as a key in the seapp_contexts file to assign a
specific label to all apps with that seinfo tag. This configuration is read by system_server during
startup.
keystore2_key_contexts assigns labels to Keystore 2.0 namespaces. These namespaces are enforced
by the keystore2 daemon. Keystore has always provided UID/AID based namespaces; Keystore 2.0
additionally enforces sepolicy-defined namespaces. A detailed description of the format and
conventions of this file can be found in the Android SELinux documentation.
Malware that is non-intrusive, meaning the malware does not modify the OS or processes within
the OS in any way.
Malware that is intrusive and modifies things that should never be modified (for example,
kernel code, a BIOS whose hash is stored in the TPM, MSR registers, and so on). This type of
malware is in general easier to spot.
Malware that is intrusive and modifies things that are designed to be modified (DATA sections);
this malware category is in general much harder to spot.
Hooking Techniques
There are five main hooking techniques:
IAT Hooks
Inline Hooks
SSDT Hooks
SYSENTER_EIP Hook
IRP Major Function Hook
IAT Hooks
The Import Address Table (IAT) is essentially a lookup table used when an application calls a function
in a different module. Entries can take the form of import by ordinal or import by name. Because a
compiled program cannot know the memory location of the libraries it depends upon, an indirect jump
is required whenever an API call is made. The general format looks like the following: "jmp dword ptr
ds:[address]". Because functions in DLLs change address, instead of calling a DLL function directly,
the application makes a call to the relevant jmp in its own jump table. When the application is
executed, the loader places a pointer to each required DLL function at the appropriate address in
the IAT.
Bypass - Detection
Since the Export Address Table (EAT) of each DLL remains intact, an application can easily bypass
IAT hooks by calling ‘GetProcAddress’ to get the real address of each DLL function. To prevent this
type of bypass, a rootkit would likely also hook ‘GetProcAddress’/’LdrGetProcedureAddress’ and use
them to return fake addresses. To bypass those hooks in turn, an application can manually implement
a local ‘GetProcAddress’ and use that local function to get the real function address.
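Since the IAT is at bottom a table of function pointers, the hooking and detection ideas above can be sketched portably with a plain C pointer table. The table and function names here are invented for the illustration; they are not the real PE structures or Windows APIs.

```c
#include <assert.h>

/* A miniature "import table": one slot per imported function. */
typedef int (*api_fn)(int);

static int real_api(int x) { return x + 1; }  /* the genuine DLL routine */
static int hook_api(int x) { return -x; }     /* the rootkit's replacement */

static api_fn iat[1] = { real_api };          /* loader fills this at startup */

/* The application never calls real_api directly; every call is an
 * indirect jump through the table, like "jmp dword ptr ds:[address]". */
int call_imported(int slot, int arg) { return iat[slot](arg); }

/* Installing the hook is a single pointer write into the table. */
void install_hook(void) { iat[0] = hook_api; }

/* Detection/bypass idea: compare the table entry against the known real
 * address (what a local GetProcAddress over a clean EAT would return). */
int is_hooked(void) { return iat[0] != real_api; }
```

Before install_hook(), call_imported(0, 5) returns 6; afterwards it returns -5, while is_hooked() reports the tampering, mirroring the EAT-versus-IAT comparison described above.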
Inline Hooks
Inline hooking, also called trampoline or detour hooking, is a method of receiving control when a
function is called, before the function has done its job. The flow of execution is redirected by
modifying the first few (usually five) bytes of a target function. A standard way to achieve this is to
overwrite the first 5 bytes of the function with a jump to a malicious block of code; the malicious code
can then read the function arguments and do whatever it needs. If the malicious code requires results
from the original function (the one it hooked), it may call the function by first executing the five bytes
that were overwritten, then jumping five bytes into the original function, which skips the malicious
jump/call and avoids infinite recursion or redirection.
The above method of bypassing DLL hooks practically involves writing your own implementation
of LoadLibrary. Another method is to use manual DLL loading to detect or even fix EAT hooks, also,
EAT hooks are very uncommon.
For inline hooks within drivers, scanning for jmp/call instructions that point outside of the driver body
is much more likely to result in false positives; however, non-standard drivers that are the targets of
jumps/calls inside standard kernel drivers should raise a red flag. It is also possible to read drivers
from disk. As drivers generally do not export many functions, and IRP major function pointers are only
initialized at runtime, it would probably be required to compare the entire code section of the original
and new driver. It is important to note that relative calls/jumps are susceptible to changes during
relocation, this means that there will be some differences between the original and new driver,
however both relative calls/jumps should point to the same place.
SSDT Hooks
The System Service Dispatch Table (SSDT) is a table of pointers to the various Zw/Nt [6] functions
that are callable from user mode. A malicious application can replace pointers in the SSDT with
pointers to its own code.
Detection
All pointers in the SSDT should point to code within ntoskrnl; if any pointer points outside
of ntoskrnl, it is likely hooked. It is possible that a rootkit could modify ntoskrnl.exe or one of the
related modules in memory and slip some code into an empty space, in which case the pointer would
still point to within ntoskrnl. As far as I'm aware, functions starting with "Zw" are intercepted by SSDT
hooks, but those beginning with "Nt" are not; therefore an application should be able to detect SSDT
hooks by comparing Nt* function addresses with the equivalent pointers in the SSDT.
Bypass
A simple way to bypass SSDT hooks would be by calling only Nt* functions instead of the Zw*
equivalent. It is also possible to find the original SSDT by loading ntoskrnl.exe (this can be done with
LoadLibraryEx in user mode) then finding the export "KeServiceDescriptorTable" and using it to
calculate the offset of KiServiceTable within the disk image (user mode applications can use
NtQuerySystemInformation to get the kernel base address), a kernel driver would be required to
replace the SSDT.
SYSENTER_EIP Hook
SYSENTER_EIP points to the code to be executed when the SYSENTER instruction is used. User
mode applications use SYSENTER to transition into kernel mode and call a kernel function (Those
beginning with Nt/Zw), usually it would point to KiFastCallEntry, but can be replaced in order to hook
all user mode calls to kernel functions.
Detection / Bypass
SYSENTER_EIP hooking does not affect kernel mode drivers, and cannot be bypassed
from user mode. In order to allow user mode applications to bypass this hook, a kernel driver must set
SYSENTER_EIP to its original value (KiFastCallEntry), this can be done using the WRMSR
instruction, however since KiFastCallEntry is not exported by ntoskrnl, getting the address would not
be simple.
IRP Major Function Hook
The driver object of each driver contains a table of twenty-eight function pointers. These pointers are
called by other drivers via IoCallDriver, and they correspond to operations such as read/write
(IRP_MJ_READ/IRP_MJ_WRITE). These pointers can easily be replaced by another driver.
Detection
In general, all IRP major function pointers for a driver should point to code within the driver's address
space. This is not always the case, but is a good start to identifying malicious drivers which have
redirected the IRP major functions of legitimate drivers to their own code.
Bypass
Due to IRP major function pointers being initialized from within the driver entry point - during runtime,
it's not really possible to get the original values by reading the original driver from disk, there are also
issues with loading a new copy of the driver due to collisions. The only way to bypass these types of
hooks would be to call the lower driver (drivers are generally stacked: the top driver passes the
data to the driver below, and so on; if the lowest driver isn't hooked, an application can just send the
request directly to the lowest driver).
A trap is a software-generated interrupt that can be caused by various factors, including an error in
an instruction, such as division by zero or an illegal memory access. A trap may also be generated
when a user program makes a specific service request of the OS.
Traps are called synchronous events because they are caused directly by the execution of the
current instruction. System calls are a type of trap in which the program asks the operating system
for a certain service, and the operating system then generates an interrupt to allow the program to
access that service.
Traps occur more frequently than asynchronous interrupts, because user code relies heavily on the
fact that the trap mechanism is how it interacts with the OS: every request for a system service goes
through a trap.
The user program on the CPU usually makes system calls by means of library calls. The library
routine's job is to validate the program's parameters, create a data structure to transfer the
arguments from the application to the operating system's kernel, and then execute a special
instruction known as a trap or software interrupt.
These special trap instructions have operands that help determine which kernel service the
application requires. As a result, when the trap executes, the hardware saves the user code's state,
switches to supervisor mode, and then dispatches to the relevant kernel procedure that can offer the
requested service.
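On Linux/x86 this path can be exercised directly with the syscall(2) wrapper, bypassing the C library's write routine. This is a sketch assuming a Linux system; raw_write is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Issue the write system call directly: syscall() loads the service
 * number and arguments, then executes the trap instruction (int 0x80,
 * sysenter, or syscall, depending on the CPU). The kernel's entry code
 * saves user state, switches to supervisor mode, and dispatches to the
 * kernel's write handler. */
long raw_write(int fd, const char *buf) {
    return syscall(SYS_write, fd, buf, strlen(buf));
}
```

On success, raw_write(1, "hello\n") returns 6, just like the library call write(1, "hello\n", 6).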
The Unix file system is a logical method of organizing and storing large amounts of information in a
way that makes it easy to manage. A file is the smallest unit in which information is stored. The Unix
file system has several important features. All data in Unix is organized into files. All files are
organized into directories. These directories are organized into a tree-like structure called the file
system.
Files in Unix System are organized into multi-level hierarchy structure known as a directory tree. At
the very top of the file system is a directory called “root” which is represented by a “/”. All other files
are “descendants” of root.
/ : The slash / character alone denotes the root of the filesystem tree.
/bin : Stands for “binaries” and contains certain fundamental utilities, such as ls or cp, which are
generally needed by all users.
/boot : Contains all the files that are required for successful booting process.
/dev : Stands for “devices”. Contains file representations of peripheral devices and pseudo-
devices.
/etc : Contains system-wide configuration files and system databases. Originally also contained
“dangerous maintenance utilities” such as init, but these have typically been moved to /sbin or
elsewhere.
/home : Contains the home directories for the users.
/lib : Contains system libraries, and some critical files such as kernel modules or device drivers.
/media : Default mount point for removable devices, such as USB sticks, media players, etc.
/mnt : Stands for “mount”. Contains filesystem mount points. These are used, for example, if the
system uses multiple hard disks or hard disk partitions. It is also often used for remote (network)
filesystems, CD-ROM/DVD drives, and so on.
/proc : procfs virtual filesystem showing information about processes as files.
/root : The home directory for the superuser “root” – that is, the system administrator. This
account’s home directory is usually on the initial filesystem, and hence not in /home (which may
be a mount point for another filesystem) in case specific maintenance needs to be performed,
during which other filesystems are not available. Such a case could occur, for example, if a hard
disk drive suffers physical failures and cannot be properly mounted.
/tmp : A place for temporary files. Many systems clear this directory upon startup; it might have
tmpfs mounted atop it, in which case its contents do not survive a reboot, or it might be explicitly
cleared by a startup script at boot time.
/usr : Originally the directory holding user home directories, its use has changed. It now holds
executables, libraries, and shared resources that are not system critical, like the X Window
System, KDE, Perl, etc. However, on some Unix systems, some user accounts may still have a
home directory that is a direct subdirectory of /usr, such as the default as in Minix. (on modern
systems, these user accounts are often related to server or system use, and not directly used by
a person).
/usr/bin : This directory stores all binary programs distributed with the operating system not
residing in /bin, /sbin or (rarely) /etc.
/usr/include : Stores the development headers used throughout the system. Header files are
mostly used by the #include directive in C/C++ programming language.
/usr/lib : Stores the required libraries and data files for programs stored within /usr or elsewhere.
/var : A short for “variable.” A place for files that may change often – especially in size, for
example e-mail sent to users on the system, or process-ID lock files.
/var/log : Contains system log files.
/var/mail : The place where all the incoming mails are stored. Users (other than root) can access
their own mail only. Often, this directory is a symbolic link to /var/spool/mail.
/var/spool : Spool directory. Contains print jobs, mail spools and other queued tasks.
/var/tmp : A place for temporary files which should be preserved between system reboots.
Types of Unix files – The UNIX files system contains several different types of files :
1. Ordinary files – An ordinary file is a file on the system that contains data, text, or program
instructions.
Used to store your information, such as some text you have written or an image you have drawn.
This is the type of file that you usually work with.
Always located within/under a directory file.
Do not contain other files.
In long-format output of ls -l, this type of file is specified by the “-” symbol.
2. Directories – Directories store both special and ordinary files. For users familiar with Windows
or Mac OS, UNIX directories are equivalent to folders. A directory file contains an entry for every file
and subdirectory that it houses. If you have 10 files in a directory, there will be 10 entries in the
directory. Each entry has two components.
(1) The Filename
(2) A unique identification number for the file or directory (called the inode number)
Directories do not contain “real” information that you would work with (such as text); they are
basically just used for organizing files.
In long-format output of ls –l , this type of file is specified by the “d” symbol.
3. Special Files – Used to represent a real physical device such as a printer, tape drive or
terminal, used for Input/Output (I/O) operations. Device or special files are used for device
Input/Output(I/O) on UNIX and Linux systems. They appear in a file system just like an ordinary file or
a directory.
On UNIX systems there are two flavors of special files for each device, character special files and
block special files :
When a character special file is used for device Input/Output(I/O), data is transferred one
character at a time. This type of access is called raw device access.
When a block special file is used for device Input/Output(I/O), data is transferred in large fixed-
size blocks. This type of access is called block device access.
For terminal devices, it’s one character at a time. For disk devices though, raw access means reading
or writing in whole chunks of data – blocks, which are native to your disk.
In long-format output of ls -l, character special files are marked by the “c” symbol.
In long-format output of ls -l, block special files are marked by the “b” symbol.
4. Pipes – UNIX allows you to link commands together using a pipe. The pipe acts as a temporary
file which only exists to hold data from one command until it is read by another. A Unix pipe provides
a one-way flow of data. The output of the first command sequence is used as the input to the second
command sequence. To make a pipe, put a vertical bar (|) on the command line between two
commands. For example: who | wc -l
In long-format output of ls –l , named pipes are marked by the “p” symbol.
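The shell's | operator is built on the pipe(2) system call. A minimal single-process sketch of the one-way data flow (in a real who | wc -l, the shell forks two processes and wires the write end to the first command's stdout and the read end to the second command's stdin); pipe_roundtrip is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Push msg through a pipe and read it back out the other end.
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen) {
    int fds[2];                        /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1)
        return -1;
    write(fds[1], msg, strlen(msg));   /* "first command" writes its output */
    close(fds[1]);                     /* writer done: reader will see EOF */
    ssize_t n = read(fds[0], out, outlen);  /* "second command" reads input */
    close(fds[0]);
    return n;
}
```

Data written into the write end comes out of the read end in the same order, and only in that direction; this only holds without blocking for messages smaller than the kernel's pipe buffer.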
5. Sockets – A Unix socket (or Inter-process communication socket) is a special file which allows
for advanced inter-process communication. A Unix Socket is used in a client-server application
framework. In essence, it is a stream of data, very similar to network stream (and network sockets),
but all the transactions are local to the filesystem.
In long-format output of ls -l, Unix sockets are marked by “s” symbol.
6. Symbolic Link – A symbolic link is used for referencing some other file of the file system. A
symbolic link is also known as a soft link. It contains a text form of the path to the file it references.
To an end user, a symbolic link appears to have its own name, but when you try reading or writing
data to this file, it instead redirects these operations to the file it points to. If we delete the soft link
itself, the data file is still there. If we delete the source file or move it to a different location, the
symbolic link will no longer function properly.
In long-format output of ls –l , Symbolic link are marked by the “l” symbol (that’s a lower case L).
UNIX V
Z File System (ZFS)
The Z File System (ZFS) is an open-source logical volume manager and file system created by Sun
Microsystems, originally for its Solaris operating system. It is now used in many operating systems
including FreeBSD, NetBSD, Mac OS X Server 10.5 and various Linux distributions through ZFS-
FUSE. The most distinguishing feature of ZFS is pooled storage, where multiple storage devices are
treated as one big pool rather than as separate devices and logical drives. Storage can be taken from
the pool and allocated to other file systems, and the pool can be increased by adding new storage
devices to the pool. This is the same method of resource allocation used in a multitenant cloud
environment.
Data integrity — A checksum is always written with the data and is recalculated when the
data are read back. If the checksums do not match, which indicates an error, then
ZFS attempts to correct the error automatically if data redundancy is available.
Pooled storage — All storage devices are added to a pool, which can be allocated to other file
systems or returned. This makes it easier to manage since a single pool is simpler than
multiple physical and logical drives. To increase the pool, new storage devices can be added.
Performance — Performance is increased by employing multiple caching mechanisms. ZFS
uses an adaptive replacement cache (ARC), which is an advanced memory-based read
cache, along with a second L2ARC, which can be added when needed, and a disk-based
synchronous write cache, which is available through ZIL (ZFS intent log).
With multiple boot environments (BEs), the process of updating software becomes a low-risk operation because system
administrators can create backup BEs before making any software updates to their system. If needed,
they have the option of booting a backup BE.
You do not have to create a backup BE as a separate step if you are updating IPS packages. When
you use the pkg install or pkg update command, use the --require-backup-be, --backup-be-name,
--be-name, or --require-new-be option to make the changes in a new boot environment, not in the
current boot environment.
To create a new boot environment, run: beadm create beName
where beName is a variable for the name of the new boot environment. This new boot
environment is inactive.
Note - The beadm create command does not create a partial boot environment. Either a new, full
boot environment is successfully created, or the command fails.
To mount a boot environment, run: beadm mount beName mountPoint
If the directory for the mount point does not exist, the beadm command creates the directory,
then mounts the boot environment on that directory. If the boot environment is already mounted,
the beadm mount command fails and does not remount the boot environment at the new location.
The boot environment is mounted, but remains inactive. Note that you can upgrade a mounted,
inactive boot environment. Also, remember to unmount the boot environment before rebooting
your system.
4. (Optional) To boot from the new boot environment, first activate it: beadm activate beName
where beName is a variable for the name of the boot environment to be activated. Upon reboot,
the newly active boot environment becomes the default boot entry that is listed in the GRUB
menu.
Snapshots
Snapshots of file systems are accessible in the .zfs/snapshot directory within the root of the file
system. For example, if tank/home/ahrens is mounted on /home/ahrens, then the
tank/home/ahrens@thursday snapshot data is accessible in the /home/ahrens/.zfs/snapshot/thursday
directory.
You can enable or disable the display of snapshot listings in the zfs list output by using
the listsnapshots pool property. This property is enabled by default.
If you disable this property, you can still display snapshot information with the zfs list -t snapshot
command, or you can re-enable the listsnapshots pool property.
Service startup
The Advanced Boot Options screen lets you start Windows in advanced troubleshooting modes. You
can access the menu by turning on your computer and pressing the F8 key before Windows starts.
Some options, such as safe mode, start Windows in a limited state, where only the bare essentials
are started. If a problem doesn't reappear when you start in safe mode, you can eliminate the default
settings and basic device drivers and services as possible causes. Other options start Windows with
advanced features intended for use by system administrators and IT professionals. For more
information, go to the Microsoft website for IT professionals.
Repair Your Computer. Shows a list of system recovery tools you can use to repair startup problems, run diagnostics, or
restore your system. This option is available only if the tools are installed on your computer's hard
disk. If you have a Windows installation disc, the system recovery tools are located on that disc.
Safe Mode
1. Remove all floppy disks, CDs, and DVDs from your computer, and then restart your computer. Click
the Start button , click the arrow next to the Shut Down button (or the arrow next to the Lock button),
and then click Restart.
2. Do one of the following:
If your computer has a single operating system installed, press and hold the F8 key as your computer
restarts. You need to press F8 before the Windows logo appears. If the Windows logo appears, you'll
need to try again by waiting until the Windows logon prompt appears, and then shutting down and
restarting your computer.
If your computer has more than one operating system, use the arrow keys to highlight the operating
system you want to start in safe mode, and then press F8.
3. On the Advanced Boot Options screen, use the arrow keys to highlight the safe mode option you
want, and then press Enter.
4. Log on to your computer with a user account that has administrator rights.
Safe Mode with Networking. Starts Windows in safe mode and includes the network drivers and
services needed to access the Internet or other computers on your network.
Safe Mode with Command Prompt. Starts Windows in safe mode with a command prompt window
instead of the usual Windows interface. This option is intended for IT professionals and
administrators.
Enable Boot Logging. Creates a file, ntbtlog.txt, that lists all the drivers that are installed during
startup and that might be useful for advanced troubleshooting.
Enable low-resolution video (640×480). Starts Windows using your current video driver and using
low resolution and refresh rate settings. You can use this mode to reset your display settings. For
more information, see Change your screen resolution.
Last Known Good Configuration (advanced). Starts Windows with the last registry and driver
configuration that worked successfully.
Directory Services Restore Mode. Starts a Windows domain controller running Active Directory so
that the directory service can be restored. This option is intended for IT professionals and
administrators.
Debugging Mode. Starts Windows in an advanced troubleshooting mode intended for IT
professionals and system administrators.
Disable automatic restart on system failure. Prevents Windows from automatically restarting if an
error causes Windows to fail. Choose this option only if Windows is stuck in a loop where Windows
fails, attempts to restart, and fails again repeatedly.
Disable Driver Signature Enforcement. Allows drivers containing improper signatures to be
installed.
Start Windows Normally. Starts Windows in its normal mode.
Dependencies
A dependency is a broad software engineering term used to refer when software relies on other
software in order to be functional. For R package dependencies, the installation of
the synapser package will take care of installing other R packages that synapser depends on, using
dependencies specified in the DESCRIPTION file. However, lower level system dependencies are not
automatically installed.
Most Windows and Mac machines have the required system dependencies. Linux machines,
including most Amazon Web Services EC2 machines, will need to be configured before
installing synapser. The following system libraries are required:
libssl-dev
libcurl-dev
libffi-dev
zlib-dev
Ubuntu Installation
To install these system dependencies on Ubuntu machines:
apt-get update -y
apt-get install -y dpkg-dev zlib1g-dev libssl-dev libffi-dev
apt-get install -y curl libcurl4-openssl-dev
Another option is to use the provided Dockerfile. For more information on installing synapser with
Docker, please see our Docker vignettes.
Redhat Installation
To install these system dependencies on Redhat machines:
yum update -y
yum install -y openssl-devel curl-devel libffi-devel zlib-devel
yum install -y R
Management
A management operating system (MOS) is the set of tools, meetings and behaviours used to manage
your people and processes to deliver results. A Management Operating System (MOS) follows the
Plan, Do, Check, Act improvement cycle to get control and steadily improve process performance.
The tools and work practices of a Management Operating System enable Front Line Leaders to
detect and correct mistakes before they become serious and helps to minimize wastage and loss.
Awareness of the performance of the process at short intervals enables corrective actions to be taken
and ensures efficiency by monitoring the allocation and use of resources.
Effective management operating systems improve operational performance; however, most systems
are built using clunky spreadsheets and whiteboards that are hard to implement and sustain. Fewzion
changes this by putting an easy-to-use, visual system in the hands of planners and front-line leaders
so that they can get control of their processes and improve performance one shift at a time.
Plan the Work
Fewzion helps you build an effective plan by connecting the work you want to do and the targets you
want to meet with the resources and equipment you need to get the plan done. Fewzion has roster
and equipment planning tools in our shiftly planning system to ensure plans can be resource balanced
easily.
Work (Do) the Plan
Shift managers and supervisors should adapt their plan to changing conditions, then take it with them
on a tablet or print it (including attachments such as work orders and toolbox talks) to communicate
and execute with their crew during the shift.
Check Performance for Accountability
Throughout and at the end of the shift, supervisors complete their “Actuals” touch screens with KPI
results and the tasks they have completed, and answer any critical questions. Shift managers should discuss
performance with their supervisors using these screens to close the “leadership loop”.
Report and Improve
Each day the gathered data is used to provide management with daily and weekly operating
reports about production and KPI performance for use in review meetings. We also send reports on
system usage and time and attendance to help identify issues with the behaviours you should expect
from your frontline leaders.
Fewzion is a web based Management Operating System (MOS) that replaces the cluster of
paperwork, spreadsheets and whiteboards used in shift planning, rosters, leave and equipment
scheduling and performance reporting to simplify your operational management. Fewzion is fast to get
going and simple to use, making it easier for your planners, managers and front line leaders to safely
improve production and performance.
From weekly and daily planning to handovers, short interval control to daily review meetings Fewzion
covers off the critical elements of your Management System so you can sustainably Plan, Do, Check
& Act your way to improved results with a better Management Operating System.
Systems update
Operating system updates contain new software that helps keep your computer current.
Examples of updates include service packs, version upgrades, security updates, drivers, or other
types of updates.
Important and high-priority updates are critical to the security and reliability of your computer. They
offer the latest protection against malicious online activities.
You need to update all of your programs, including Windows, Internet Explorer, Microsoft Office, and
others. Visit Microsoft Update to scan your computer and see a list of updates, which you can then
decide whether to download and install.
NOTE: Microsoft offers security updates on the second Tuesday of the month.
It's important to install new security updates as soon as they become available.
The easiest way to do this is to turn on automatic updating and use the recommended setting, which
downloads recommended updates and installs them on a schedule you set.
In Windows Vista, you control the automatic updating settings through the Windows Update Control
Panel. For more information, see Turn automatic updating on or off.
Scope of Module This module introduces the students to the concepts involved in implementing
network stacks (the software and firmware that implement a computer networking protocol suite such
as TCP/IP over Ethernet).
The aim of the module is to introduce students to the software embedded in network devices such as
routers to implement network protocols. Where possible, open source implementations of protocols
used in live networks will be studied. Both the data plane and the control plane will be studied,
including data-link layer protocols, network layer protocols and transport layer protocols. Optimisation
techniques, hardware acceleration and other approaches to achieving “wire speed” operation will be
investigated. Protocols appropriate to the Internet of Things, to data centres, and to the future Internet
will be considered. Having successfully completed this course, the students will be able to:
classify network functionality as belonging to the control plane and the data plane respectively
explain how a typical operating system processes packets from arrival from an interface card
to forwarding to user space
describe the principles involved in implementing a network stack in software
decompose the software of “middleboxes” such as network routers into a software
architecture
evaluate the trade-offs involved in hardware versus software implementation of packet
processing functions
demonstrate advanced theoretical knowledge of networking
add functionality to an open-source network stack
adapt existing software to meet new networking requirements
The module involves exploring the lower layers (the transport layer and below) and also the control
plane (which straddles the layers) of protocol stacks. Conventional courses on network programming
treat the network as something neatly wrapped inside an Applications Programming Interface such as
the BSD sockets API. In this module, we look “under the hood” at the software that lies below this
interface.
We’ll be using the Linux implementation of the network stack as our reference for how an Operating
System implements network protocols. Ideally we’d study a real-time OS or one used in commercial
routers, but these are, unfortunately, not open-source. For the same reason, we will not be looking at
how the lower layers are implemented in the market leader OS.
Disclaimer
This is not a training course on Linux kernel software development; it is an academic module about
the principles involved in implementing network stacks, which is illustrated by examples drawn largely
from the Linux kernel. The Linux kernel is updated frequently, and many optimisations and bug fixes
obscure the principles involved. If a concept is clearly illustrated in vX.Y of the kernel, and obfuscated
in later versions, the concept will continue to be illustrated using the vX.Y source code.
The flow of a packet through the Linux network stack is quite intriguing and has been a topic of
research, with an eye toward performance enhancement in end systems. This document is based on the
TCP/IP protocol suite in Linux kernel version 2.6.11 - the kernel prevalent at the time of
writing. The sole purpose of this document is to take the reader through the path of a
network packet in the kernel, with pointers to LXR targets where one can look at the kernel
functions that do the actual magic.
This document can serve as a ready reference for understanding the network stack, and its discussion
includes KURT DSKI instrumentation points, which are highly useful in monitoring packet behavior
in the kernel.
We base our discussion on the scenario where data is written to a socket, and the path of the resulting
packet is traced in a code walk-through sense.
TCP/IP - Overview
TCP/IP is the most ubiquitous network protocol suite in today’s networks. The protocol has its
roots in the 1970s, predating the formulation of the ISO OSI standards. As a result, there are four
well-defined layers in the TCP/IP protocol suite, which together encapsulate the functionality of the
popular seven-layered OSI architecture.
Relating TCP/IP to the OSI model - The application layer in the TCP/IP protocol suite
comprises of the application, presentation and the sessions layer of the ISO OSI model.
The socket layer acts as the interface between the application layer and the transport layer. This
layer is also called the Transport Layer Interface. It is worth mentioning that there are two kinds
of sockets which operate in this layer, namely connection-oriented (streaming) sockets and
connectionless (datagram) sockets.
The next layer in the stack is the Transport Layer, which encapsulates the TCP and UDP
functionality within it. This forms Layer 4 of the TCP/IP protocol stack in the kernel. The Network
Layer in the TCP/IP protocol suite is called the IP layer, as this layer contains the information about
the network topology; this forms Layer 3 of the TCP/IP protocol stack. This layer also understands
the addressing schemes and the routing protocols.
The Link Layer forms Layer 2 of the stack and takes care of the error-correction routines which are
required for error-free and reliable data transfer.
The last layer is the Physical Layer which is responsible for the various modulation and electrical
details of data communication.
NOTE: All bold faced text are LXR search strings and the corresponding files are mostly mentioned
alongside each LXR target. In the event of file names not being mentioned, an identifier search with
the LXR targets will lead to the correct location which is in context.
I would like to briefly explain the structure of the Linux Netfilter architecture, how it works, and how
packets flow through a Linux machine.
What is a Firewall?
A firewall is a device, implemented in software or hardware, which is used to filter the packets going
through the network on the basis of some rules and policies.
The firewall has two components: one is packet filtering and the second is an application-level gateway.
Both of these technologies are used to filter packets depending on packet header and payload
information. A packet filter works up to layer 4 (the Transport Layer) in the TCP/IP model. If we
want to filter packets on the basis of payload or data, then an application-level gateway is used. This
article focuses on the packet filter architecture and how it works. Please look at the IPv4 header, TCP
header, UDP header and ARP header on
Wikipedia. https://en.wikipedia.org/wiki/IPv4
How Does a Packet Filter Work?
A packet filter is a component of a firewall that filters packets according to specified rules. The packet
filter takes the packet and matches it against the rules in iptables (the program provided for the Linux
kernel firewall); the header information of the packet is compared with the features specified in each
rule. If the header properties of the packet match a rule's features, then the corresponding action is
triggered for that rule. Remember that each rule has two sub-components: the features, which hold all
the information that is compared against the packet header, and the target, in which the actions are
specified, e.g. drop the packet, send the packet to another rule chain, or accept the packet. Rules are
evaluated in linear order, so keep in mind that rule order is crucial. The default policy works such that
if no rule matches the packet, the default accepts it, or the other way around; what to accept or deny
by default must be manually specified.
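The first-match evaluation described above can be sketched as a toy filter in Python (the rule format is invented for illustration; real iptables rules are far richer):

```python
# Toy first-match packet filter: each rule has match features and a target.
RULES = [
    {"match": {"proto": "tcp", "dst_port": 22}, "target": "ACCEPT"},
    {"match": {"proto": "tcp", "dst_port": 23}, "target": "DROP"},
    {"match": {"proto": "udp"},                 "target": "DROP"},
]
DEFAULT_POLICY = "ACCEPT"  # applied only when no rule matches

def filter_packet(header, rules=RULES):
    # Rules are checked in linear order; the first full match decides the fate.
    for rule in rules:
        if all(header.get(k) == v for k, v in rule["match"].items()):
            return rule["target"]
    return DEFAULT_POLICY

print(filter_packet({"proto": "tcp", "dst_port": 22}))  # ACCEPT (rule 1)
print(filter_packet({"proto": "tcp", "dst_port": 23}))  # DROP   (rule 2)
print(filter_packet({"proto": "icmp"}))                 # ACCEPT (default)
```

Swapping the first two rules, or changing DEFAULT_POLICY to "DROP", changes the outcome, which is why rule order and an explicit default policy matter.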
What is a Packet?
A packet is a chunk of information that flows through the internet. A packet contains all the
information that intermediate stations need to get it to the destination point, e.g. the packet header.
For example, a TCP/IP packet may have a sender IP address, receiver IP address, sender port number,
receiver port number and the protocol to which, on the receiving side, the packet will be handed over.
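For illustration, the header fields listed above can be packed and unpacked as a minimal 20-byte IPv4 header (no options) with Python's struct module; the addresses are documentation addresses and the checksum is left at zero:

```python
import struct, socket

# Pack a minimal IPv4 header: this is what a packet filter inspects.
header = struct.pack(
    "!BBHHHBBH4s4s",
    (4 << 4) | 5,                      # version 4, header length 5 words
    0,                                 # type of service
    40,                                # total length (header + payload)
    0x1234, 0,                         # identification, flags/fragment offset
    64,                                # TTL
    6,                                 # protocol: 6 = TCP
    0,                                 # checksum (left zero in this sketch)
    socket.inet_aton("192.0.2.1"),     # sender IP address
    socket.inet_aton("198.51.100.7"),  # receiver IP address
)

# An intermediate station unpacks the same fields to decide where to send it.
(ver_ihl, _, total_len, _, _, ttl, proto, _, src, dst) = struct.unpack(
    "!BBHHHBBH4s4s", header)
print(socket.inet_ntoa(src), "->", socket.inet_ntoa(dst), "proto", proto)
```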
The Netfilter architecture is an indispensable component of the Linux firewall. I will briefly explain
here what chains are and how they work.
There are five chains, but we are only concerned with three of them, which are crucial for getting
into this topic. Whenever a packet comes to a machine or PC, there is a NIC card through which
network traffic goes in or out.
As depicted above in figure 1.3, when a TCP/IP packet comes to the network interface card it is sent
to the PREROUTING chain, where the decision is made, depending on the packet header information,
whether the packet is destined for a local process, for another router, or for another interface; that is,
the routing decision is made for the packet.
1. INPUT CHAIN: If the packet is destined for a local process (a process is the execution of code
at run time), it goes through the INPUT chain. Remember that the local process could be any
application interacting with the network; consider, for example, an application running on port 80.
2. OUTPUT CHAIN: If the packet is generated by a local process and is intended to go to
another machine, network or router, that packet flows through the OUTPUT chain and then the
POSTROUTING chain, and is then handed over to the network interface card.
3. FORWARD CHAIN: If a packet comes in through a network interface and the decision is made
that the packet is not intended for the local machine but for another network interface, in other
words the packet is for another machine or router, then the packet goes through the FORWARD
chain, is then sent to POSTROUTING, and lastly to the NIC.
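The three-chain traversal described above can be modelled as a toy routing decision in Python (the function and address set are hypothetical; the real logic lives in the kernel's netfilter hooks):

```python
# Addresses considered "local" to this machine (made up for illustration).
LOCAL_ADDRESSES = {"10.0.0.1"}

def chains_traversed(packet):
    """Return the netfilter chains a packet visits, per the rules above."""
    path = []
    if packet["src"] in LOCAL_ADDRESSES:
        # Locally generated: OUTPUT, then POSTROUTING, then out the NIC.
        path += ["OUTPUT", "POSTROUTING"]
    else:
        path.append("PREROUTING")  # every arriving packet hits PREROUTING
        if packet["dst"] in LOCAL_ADDRESSES:
            path.append("INPUT")   # destined for a local process
        else:
            path += ["FORWARD", "POSTROUTING"]  # routed onward to the NIC
    return path

print(chains_traversed({"src": "192.0.2.9", "dst": "10.0.0.1"}))
# ['PREROUTING', 'INPUT']
print(chains_traversed({"src": "192.0.2.9", "dst": "203.0.113.5"}))
# ['PREROUTING', 'FORWARD', 'POSTROUTING']
```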
I hope this is useful for getting to know the Netfilter architecture and how it works. For practical
purposes, run “sudo iptables -v -L” on Linux; you will be able to see all three chains and play
around by creating client-server machines.