A mapping guide for complex, multithreaded, multiprocess applications
The wave of migration to open source in business has the potential to cause a tremendous porting traffic jam as developers move pervasive Windows® applications to the Linux™ platform. In this three-part series, get a mapping guide, complete with examples, to ease your transition from Windows to Linux. Part 1 introduces processes and threads.
Today many global businesses and services are going open source, and all the major corporate players in the industry are pushing for it. This trend has spurred a major migration exercise in which many existing products maintained for various platforms (Windows, OS/2, Solaris, etc.) will be ported to open source Linux platforms.
Many applications were designed without considering the need to port them to Linux. This has the potential to be a porting nightmare, but it doesn’t have to be. The goal of this series of articles is to help you migrate complex applications involving IPC and threading primitives from Windows to Linux. We share our experiences in moving these critical Windows IPC applications, including multithreaded apps that require thread synchronization and multiprocess apps that require interprocess synchronization.
In short, think of this series as a mapping document — it provides mapping of various Windows calls to Linux calls related to threads, processes, and interprocess communication elements (mutexes, semaphores, etc.). We’ve divided the mapping into three chunks:
- Part 1 deals with processes and threads.
- Part 2 handles semaphores and events.
- Part 3 covers mutexes, critical sections, and wait functions.
Basic execution units in Windows and Linux are different. In Windows, the thread is the basic execution unit, and the process is a container that holds this thread.
In Linux, the basic execution unit is the process. The functionalities offered by Windows APIs can be mapped directly to Linux system calls:
The Classification column (which explains classification constructs used in this article) indicates whether the Windows construct is mappable or context specific:
- If mappable, the Windows construct can be mapped to the specified Linux construct(s) by closely examining the types, parameters, return codes, and such. Both the Windows and Linux constructs provide similar functionality.
- If context specific, the given Windows construct may or may not have an equivalent construct in Linux, or Linux may have more than one construct that provides similar functionality. In either case, the decision to use a specific Linux construct(s) depends on the application context.
Creating a process
In Windows, you can use CreateProcess() to create a new process. The CreateProcess() function creates a new process and its main thread. Among its parameters:
- bInheritHandles determines whether the child inherits the handles of the parent.
- lpCommandLine gives the command line, including the name and path, of the process to be started.
- lpEnvironment defines the environment that is visible to the new process.
In Linux, the exec* family of functions (execl(), execlp(), execle(), execv(), and execvp()) replaces the current process image with a new process image. These versions of exec* are just various calling interfaces for the core function int execve(const char *filename, char *const argv[], char *const envp[]). Here:
- argv is an array of pointers to the argument strings, terminated by a NULL pointer.
- envp is an array of pointers to the environment variables, which are basically key=value pairs, also terminated by a NULL pointer.
Because exec* replaces the calling process, it must be used along with the fork() call so that both the parent and the child processes keep running. fork() creates a child process that differs from the parent process only in its PID and PPID; in addition, its resource utilizations are set to 0.
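The fork-then-exec pattern described above can be sketched as follows. This is a minimal illustration, not a complete launcher: the helper name spawn_and_wait() is our own, and the exit code 127 for a failed exec is only a convention borrowed from the shell.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] as a child process and return its
 * exit status, or -1 if the child could not be created or did not
 * exit normally. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork() failed */

    if (pid == 0) {
        /* Child: replace this process image with the new program. */
        execvp(argv[0], argv);
        _exit(127);             /* reached only if execvp() failed */
    }

    /* Parent: keeps running and waits for the child to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit() rather than exit() after a failed exec so that it does not flush stdio buffers it inherited from the parent.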
By default, the program started by exec() inherits the group and user IDs of the parent process, which makes it dependent on the parent process. This can be changed by:
- Setting the set-gid bit on the program file that exec() points to
- Using the
In Windows, the CreateProcessAsUser() function is similar to CreateProcess() except that the new process runs in the security context of the user represented by the hToken parameter. There is no one-to-one equivalent for this function in Linux, but it can be replicated using the following logic:
- fork() to create a new child process with a new PID
- setuid() to switch the child to the target user ID
- exec() to replace the existing process image with the process to execute
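The three steps above might look like this in C. Treat it as a sketch under assumptions: run_as_user() and the exit codes 126 and 127 are hypothetical, the sketch also switches the group ID (which the steps above do not mention), and switching to a different user normally succeeds only when the caller is privileged.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv[0] under the given user and group IDs.
 * Returns the exit status of the program, or -1 on error. */
int run_as_user(uid_t uid, gid_t gid, char *const argv[])
{
    pid_t pid = fork();                 /* step 1: new child, new PID */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* step 2: change the group first, then the user; once the
         * user ID has changed, the process may no longer be allowed
         * to change its group. */
        if (setgid(gid) != 0 || setuid(uid) != 0)
            _exit(126);
        execvp(argv[0], argv);          /* step 3: new process image */
        _exit(127);                     /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A production version would usually also reset the supplementary groups (for example with initgroups()) before calling setuid().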
Terminating a process
To forcibly terminate a running process, you can use TerminateProcess() in Windows. This function terminates the running process and all the associated threads. Use this function only in extreme scenarios.
In Linux, you can use kill() to forcibly kill a process: int kill(pid_t pid, int sig). This system call sends the signal sig to the process whose ID is pid; sending SIGKILL terminates that process unconditionally. You can also use it to signal a whole process group.
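As a rough counterpart to TerminateProcess(), the following sketch sends SIGKILL to a child and then reaps it. Both helper names are hypothetical and exist only for this illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child that just sleeps until signaled.
 * Returns the child PID in the parent, or -1 if fork() failed. */
pid_t start_idle_child(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (;;)
            pause();
    }
    return pid;
}

/* Hypothetical helper: forcibly terminate the child `pid` and reap it.
 * Returns the number of the signal that ended it, or -1 on error. */
int kill_and_reap(pid_t pid)
{
    if (kill(pid, SIGKILL) != 0)    /* SIGKILL cannot be caught or ignored */
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Reaping the child with waitpid() after the kill() keeps it from lingering as a zombie process.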
Using wait functions
In cases where the child process is dependent on the parent process, you can use wait functions in the parent process to wait for the child process to terminate. In Windows, you can use the WaitForSingleObject() function call to achieve this. You can use the WaitForMultipleObjects() function to wait for more than one object. You populate the object-handle array with the objects to wait for. Based on the bWaitAll option, you can either wait for all the objects to be signaled or wait for any one of them to be signaled.
In both of these functions, if you want to wait for a finite time, you can specify the time interval in the dwMilliseconds parameter. If you want to wait infinitely, use INFINITE as the value for dwMilliseconds. Setting dwMilliseconds to 0 just tests the state of the object and returns.
You can use waitpid() in Linux if you want to just wait infinitely for the process to die: pid_t waitpid(pid_t pid, int *status, int options). In Linux, there is no way to do a timed wait on a waitpid() call; with options set to 0, waitpid() waits indefinitely for the child process to terminate, and the WNOHANG option only checks the state of the child and returns immediately. Wait functions, in both Windows and Linux, suspend the execution of the calling process until the awaited process completes, but in Windows there is an option to return early by specifying a time value. You can implement a timed wait or NO WAIT functionality similar to WaitForMultipleObjects() using System V semaphores, which are discussed in Part 2 of this series. Part 3 of this series further discusses wait functions.
Exiting a process
Exiting a process means exiting gracefully with proper cleanup. In Windows, you use ExitProcess() to perform this operation. ExitProcess() is the preferred method of ending a process. This function provides a clean process shutdown, which includes calling the entry-point function of all attached dynamic-link libraries (DLLs) with a value indicating that the process is detaching from the DLL.
The Linux equivalent for ExitProcess() is exit(): void exit(int status);. The exit() function causes normal program termination, and the value of status & 0377 is returned to the parent. The C standard specifies two definitions (EXIT_SUCCESS and EXIT_FAILURE) that can be passed as the status parameter to indicate successful or unsuccessful termination.
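To see the status & 0377 rule in practice, the sketch below forks a child that calls exit() and reports what the parent observes. The helper name observed_exit_status() is hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that calls exit(code) and return
 * the status value the parent actually observes, or -1 on error. */
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        exit(code);             /* normal termination with cleanup */

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status); /* only the low 8 bits of code survive */
}
```

For example, a child that calls exit(300) is seen by the parent as having exited with 44, because 300 & 0377 is 44.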
Each process has an environment block associated with it, basically name=value pairs that specify the various settings the process can access. Even though we can specify the environment when we create the process, there are also specific functions to set and obtain environment variables after the process is created.
In Windows, you can use GetEnvironmentVariable() and SetEnvironmentVariable() to get and set the environment variables. GetEnvironmentVariable() returns the size of the value stored in the buffer on success and 0 if the name specified is not a valid environment variable name. The SetEnvironmentVariable() function sets the contents of the specified environment variable for the current process. If the function succeeds, the return value is non-zero. If the function fails, the return value is zero.
In Linux, the getenv() and setenv() library functions provide the equivalent functionality. The getenv() function searches the environment list for a string that matches the string pointed to by name. This function returns a pointer to the value in the environment or NULL if there is no match. The setenv() function adds the variable name to the environment with the given value if name does not already exist. If name does exist in the environment, then its value is changed to value if overwrite is non-zero. If overwrite is zero, then the value of name is not changed. The setenv() function returns zero on success or -1 if there was insufficient space in the environment.
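The overwrite flag is the part of setenv() that most often surprises people coming from SetEnvironmentVariable(), which always replaces the value. A small sketch, with a hypothetical helper name:

```c
#include <stdlib.h>

/* Hypothetical helper: set `name` only if it is not already defined,
 * and return the value that is in effect afterwards (NULL on error). */
const char *env_default(const char *name, const char *fallback)
{
    if (setenv(name, fallback, 0) != 0)   /* overwrite == 0: keep existing value */
        return NULL;
    return getenv(name);
}
```

Calling setenv(name, value, 1) instead replaces any existing value, which is the closer match to the Windows behavior.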
The following examples illustrate what we’ve discussed in this section.
Listing 1. Windows process code
In Windows, the thread is the basic unit of execution. One or more threads run in the context of the process. The scheduling code is implemented in the kernel. There is no single "scheduler" module or routine.
The Linux kernel uses a process model rather than a threading model. The Linux kernel provides a lightweight process framework for creating threads; the actual thread implementation is in the user space. There are various threading libraries available (LinuxThreads, NGPT, NPTL, and so on) in Linux. The information in this article is based on the LinuxThreads library, but it is also applicable to Red Hat’s Native POSIX Thread Library (NPTL).
This section describes threading in Windows and in Linux. It covers the calls for creating a thread, setting its attributes, and changing its priority.
Creating a thread
In Windows, you can use CreateThread() to create a thread that executes within the virtual address space of the calling process. lpThreadAttributes is a pointer to the thread attributes that determine whether the thread handle can be inherited by a child process.
Linux uses the pthread library call
pthread_create() to spawn a thread:
Note: In Windows, the number of threads a process can create is limited by the available virtual memory. By default, every thread has one megabyte of stack space. Therefore, you can create at most 2,028 threads. If you reduce the default stack size, you can create more threads. In Linux, you can display the maximum number of processes per user with ulimit -a (which lists all limits for the current shell), and you can update it by using ulimit -u, but the change is valid only for that logon. The header files /usr/include/limits.h and ulimit.h define these constants. You can modify them and recompile the kernel to have a permanent effect. For POSIX thread limits, the PTHREAD_THREADS_MAX macro defines the maximum limit and is defined in local_lim.h.
Specifying the thread function
In Windows, the lpStartAddress parameter of CreateThread() is the address of the function that the newly created thread will execute.
In Linux, the start_address parameter of the library call pthread_create() is the address of the function that the newly created thread will execute.
Parameter passing to the thread function
In Windows, the lpParameter parameter of CreateThread() specifies the parameter to be passed to the newly created thread. It specifies the address of the data item to be passed to the new thread.
In Linux, the arg parameter of the library call pthread_create() specifies the parameter to be passed to the new thread.
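The thread function and its argument come together in a call such as the one sketched below. The struct job type and both function names are hypothetical; the point is only that the address of a data item travels to the new thread through arg.

```c
#include <pthread.h>

/* Hypothetical data item handed to the new thread. */
struct job {
    int input;
    int result;
};

/* Thread function: receives the address that was passed as `arg`. */
static void *square_job(void *arg)
{
    struct job *j = arg;
    j->result = j->input * j->input;
    return NULL;
}

/* Hypothetical helper: compute n*n on a separate thread.
 * Returns 0 on success or the pthread error number. */
int square_on_thread(int n, int *out)
{
    struct job j = { n, 0 };
    pthread_t tid;

    int rc = pthread_create(&tid, NULL, square_job, &j);
    if (rc != 0)
        return rc;

    /* j lives on this stack frame, so wait before it goes out of scope. */
    rc = pthread_join(tid, NULL);
    if (rc == 0)
        *out = j.result;
    return rc;
}
```

As with lpParameter in Windows, the data the pointer refers to must stay valid for as long as the new thread uses it.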
Setting the stack size
In Windows, the dwStackSize parameter of CreateThread() is the size, in bytes, of the stack to be allocated for the new thread. The stack size should be a non-zero multiple of 4 KB and a minimum of 8 KB.
In Linux, the stack size is set in the pthread attributes object; that is, the parameter threadAttr of type pthread_attr_t that is passed to the library call pthread_create(). This object needs to be initialized by the call pthread_attr_init() before any attributes are set. The attribute object is destroyed using the call pthread_attr_destroy().
Note that all of the pthread_attr_setxxxx calls achieve similar functionality to the pthread_xxxx calls (if available), except that you can use pthread_attr_setxxxx only before thread creation, to update the attribute object that will be passed as a parameter to pthread_create(). Meanwhile, you can use the pthread_xxxx calls at any time after the thread has been created.
The stack size is set using the call int pthread_attr_setstacksize(pthread_attr_t *threadAttr, size_t stack_size);.
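Putting the attribute calls together, a thread with a custom stack size might be created as sketched here. The helper run_with_stack() is hypothetical, and sizes below the system minimum (PTHREAD_STACK_MIN) are expected to be rejected by pthread_attr_setstacksize().

```c
#include <pthread.h>
#include <stddef.h>

static void *do_nothing(void *arg)
{
    (void)arg;
    return NULL;
}

/* Hypothetical helper: create and join a thread with the requested
 * stack size. Returns 0 on success or the pthread error number;
 * *actual receives the size stored in the attribute object. */
int run_with_stack(size_t stack_size, size_t *actual)
{
    pthread_attr_t attr;
    pthread_t tid;

    int rc = pthread_attr_init(&attr);          /* must precede any setter */
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_attr_getstacksize(&attr, actual);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, do_nothing, NULL);
    if (rc == 0)
        rc = pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);    /* not needed after pthread_create() */
    return rc;
}
```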
Exiting a thread
In Windows, the system call ExitThread() terminates the thread. The dwExitCode parameter is the return value of the thread, and it can be retrieved from another thread by calling GetExitCodeThread().
The Linux equivalent for this is the library call pthread_exit(): void pthread_exit(void *retval);. Here retval is the return value of the thread, and you can retrieve it from another thread by calling pthread_join().
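The round trip of an exit value from one thread to another can be sketched as follows. The names worker() and thread_exit_value() are hypothetical, and packing a small integer into the void * value is a common convenience rather than a requirement.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread function: ends the thread with an explicit exit value. */
static void *worker(void *arg)
{
    intptr_t code = (intptr_t)arg + 1;
    pthread_exit((void *)code);     /* same effect as `return (void *)code;` */
}

/* Hypothetical helper: run worker(seed) and fetch its exit value.
 * Returns 0 on success or the pthread error number. */
int thread_exit_value(int seed, int *value)
{
    pthread_t tid;
    void *retval = NULL;

    int rc = pthread_create(&tid, NULL, worker, (void *)(intptr_t)seed);
    if (rc != 0)
        return rc;

    rc = pthread_join(tid, &retval);    /* blocks until the thread has exited */
    if (rc == 0)
        *value = (int)(intptr_t)retval;
    return rc;
}
```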
In Windows, there are no explicit thread states maintained with respect to thread termination. However, WaitForSingleObject() allows a thread to wait explicitly on the termination of a specific or non-specific thread within the process.
In Linux, threads are by default created in joinable state. In joinable state, another thread can synchronize on the thread’s termination and recover its termination code using the function pthread_join(). The thread resources of the joinable thread are released only after it is joined.
In Windows, use WaitForSingleObject() to wait for a thread to terminate:
- hHandle is the handle of the thread.
- dwMilliseconds is the timeout value in milliseconds. If the value is set to INFINITE, the call blocks the calling thread/process indefinitely.
In Linux, use pthread_join() to do the same: int pthread_join(pthread_t thread, void **thread_return);.
In the detached state, the thread resources are freed immediately when the thread terminates. The detached state can be set by calling pthread_attr_setdetachstate() on the thread attribute object: int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);. A thread created in a joinable state can later be put into a detached state using the pthread_detach() call: int pthread_detach(pthread_t id);.
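A thread can be created directly in the detached state through the attribute object, as in this sketch. The helper start_detached() is a hypothetical name used only here.

```c
#include <pthread.h>

static void *background_task(void *arg)
{
    (void)arg;
    return NULL;
}

/* Hypothetical helper: start a detached thread and report the detach
 * state stored in the attribute object through *state.
 * Returns 0 on success or the pthread error number. */
int start_detached(int *state)
{
    pthread_attr_t attr;
    pthread_t tid;

    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_attr_getdetachstate(&attr, state);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, background_task, NULL);

    pthread_attr_destroy(&attr);
    return rc;      /* no pthread_join(): a detached thread cannot be joined */
}
```

Because nobody can join a detached thread, any result it produces has to be handed over by other means, such as shared data protected by a synchronization object.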
In Windows, the priority of the thread is determined by the priority class of its process and the priority level of the thread within the priority class of the process. In Linux, the thread itself is the unit of execution and has its own priority. It has no dependency on the priority of its process.
In Windows, you can use
SetPriorityClass() to set the priority class for the specified process:
dwPriorityClass is the priority class of the process, and it is set to any of the following values:
Once the priority class of the process is set,
SetThreadPriority() is used to set the priority level of the thread within the priority class of the process:
nPriority is the priority value of the thread, and it is set to one of the following values:
- THREAD_PRIORITY_ABOVE_NORMAL sets the priority to 1 point above the priority class.
- THREAD_PRIORITY_BELOW_NORMAL sets the priority to 1 point below the priority class.
- THREAD_PRIORITY_HIGHEST sets the priority to 2 points above the priority class.
- THREAD_PRIORITY_IDLE sets the base priority to 1 for HIGH_PRIORITY_CLASS processes, and sets the base priority to 16 for REALTIME_PRIORITY_CLASS processes.
- THREAD_PRIORITY_LOWEST sets the priority to 2 points below the priority class.
- THREAD_PRIORITY_NORMAL sets the normal priority for the priority class.
- THREAD_PRIORITY_TIME_CRITICAL sets the base priority to 15 for HIGH_PRIORITY_CLASS processes, and sets the base priority to 31 for REALTIME_PRIORITY_CLASS processes.
Examples of processes and threads
To wrap up this installment, let’s look at some examples of the following types of processes and threads:
- Normal or regular processes and threads
- Time-critical and real-time processes and threads
Normal or regular processes/threads
The Linux system call
setpriority() is used to set or modify priority levels for normal processes and threads. The parameter scope is PRIO_PROCESS. Set id to 0 to change the current process (or thread) priority. The parameter delta is the priority value, which on Linux lies in the range -20 to 19. Note also that in Linux, a lower delta value means a higher priority. So you set +19 for IDLETIME priority and 0 for regular priority.
In Windows, the priority range is from 1 (lower priority) to 15 (higher priority) for the regular threads. But in Linux, the priority range for normal non-real-time processes is from -20 (higher) to +19 (lower priority). Windows values have to be mapped onto this range before being used with int setpriority(int scope, int id, int delta);.
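One way to do the mapping is a simple linear scale, sketched below together with a setpriority() call. The linear formula is our own assumption, not a standard conversion, and both helper names are hypothetical. The sketch uses the Linux nice range of -20 to 19, and lowering the nice value (raising priority) usually requires privileges.

```c
#include <errno.h>
#include <sys/resource.h>

/* Assumed linear mapping (not from any standard): Windows regular
 * priority 1..15 onto Linux nice values 19..-20. */
int win_to_nice(int win_priority)
{
    if (win_priority < 1)  win_priority = 1;
    if (win_priority > 15) win_priority = 15;
    return 19 - ((win_priority - 1) * 39 + 7) / 14;   /* rounded */
}

/* Hypothetical helper: apply a nice value to the calling process and
 * return the value now in effect, or -100 on error. */
int set_own_nice(int nice_value)
{
    if (setpriority(PRIO_PROCESS, 0, nice_value) != 0)
        return -100;

    errno = 0;                      /* -1 is a legal return value */
    int now = getpriority(PRIO_PROCESS, 0);
    return (now == -1 && errno != 0) ? -100 : now;
}
```

With this formula, Windows priority 1 maps to nice 19, priority 15 maps to nice -20, and the Windows default of 8 lands at nice -1.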
Time-critical and real-time processes and threads
You can use the Linux system call sched_setscheduler() to change the scheduling priority and policy of a running process: int sched_setscheduler(pid_t pid, int policy, const struct sched_param *param);.
The parameter policy is the scheduling policy. The possible values for policy are
SCHED_OTHER (for regular non-real-time scheduling),
SCHED_RR (real-time round-robin policy), and
SCHED_FIFO (real-time FIFO policy).
param is a pointer to a structure representing the scheduling priority. The priority can range from 1 to 99 only for real-time policies. For others (normal non-real-time processes), it is zero.
In Linux, for a known scheduling policy, it is also possible to change only the process priority by using the system call int sched_setparam(pid_t pid, const struct sched_param *param);.
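The sketch below wraps sched_setscheduler() and checks a priority against the limits the system reports for a policy. Both helper names are hypothetical. Requesting SCHED_RR or SCHED_FIFO typically fails with EPERM unless the process has the necessary privileges.

```c
#include <sched.h>
#include <string.h>

/* Hypothetical helper: switch the calling process to `policy` with
 * `priority`. Returns 0 on success or -1 with errno set. */
int set_policy(int policy, int priority)
{
    struct sched_param param;
    memset(&param, 0, sizeof param);
    param.sched_priority = priority;
    return sched_setscheduler(0, policy, &param);   /* pid 0 = caller */
}

/* Hypothetical helper: returns 1 if `priority` is valid for `policy`
 * on this system, 0 otherwise. */
int priority_in_range(int policy, int priority)
{
    return priority >= sched_get_priority_min(policy) &&
           priority <= sched_get_priority_max(policy);
}
```

Querying sched_get_priority_min() and sched_get_priority_max() instead of hard-coding 1 and 99 keeps the code portable to other POSIX systems, whose ranges may differ.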
The LinuxThreads library call pthread_setschedparam is the thread version of sched_setscheduler and is used to dynamically change the scheduling priority and policy for a running thread: int pthread_setschedparam(pthread_t target_thread, int policy, const struct sched_param *param);. target_thread indicates the thread whose priority is to be changed; param indicates the priority.
You can use the LinuxThreads library calls pthread_attr_setschedpolicy and pthread_attr_setschedparam to set the scheduling policy and the priority level in the thread attribute object before the thread is created:
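A sketch of preparing such an attribute object follows. The helper init_sched_attr() is hypothetical. It also calls pthread_attr_setinheritsched(), which the text above does not mention but which is needed for the policy and priority in the attribute object to take effect.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper: prepare an attribute object that requests
 * `policy` at `priority` for a thread created from it. On success
 * the caller must release it with pthread_attr_destroy(). */
int init_sched_attr(pthread_attr_t *attr, int policy, int priority)
{
    struct sched_param param;
    memset(&param, 0, sizeof param);
    param.sched_priority = priority;

    int rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Without this, the new thread inherits the scheduling settings
     * of its creator and the two attributes below are ignored. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, policy);   /* policy first */
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &param);    /* then priority */
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

Filling in the attribute object needs no special rights, but pthread_create() with a real-time policy in that object generally fails with EPERM for unprivileged processes.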
In Windows, the priority range is from 16 (lower priority) to 31 (higher priority) for the real-time threads. In Linux, the priority range for real-time threads is from 99 (higher) to 1 (lower priority). This has to be mapped before being used.
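As with regular priorities, a linear scale is one simple way to do this mapping. The formula and the function name below are our own assumptions for illustration; an application may well need a hand-tuned table instead.

```c
/* Assumed linear mapping (not from any standard): Windows real-time
 * priority 16..31 onto Linux real-time priority 1..99. In both
 * systems a larger number means a higher priority. */
int win_rt_to_linux_rt(int win_priority)
{
    if (win_priority < 16) win_priority = 16;
    if (win_priority > 31) win_priority = 31;
    return 1 + ((win_priority - 16) * 98 + 7) / 15;   /* rounded */
}
```

The result can be stored in the sched_priority field of struct sched_param for use with SCHED_RR or SCHED_FIFO.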
The following listings illustrate the concepts in this section.
Next in the series
This first part of the series has given you a guide to help map Windows processes and threads to their functional counterparts in Linux. Part 2 in the series covers synchronization objects and primitives, starting with semaphores and events. Part 3 covers mutexes, critical sections, and wait functions.