Linux Multithreaded Programming

Published: 2021-07-01 04:36:52 Category: technical articles


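Threads (following the thread chapters of 《Linux C程序设计大全》)

Threads and processes

Threads and processes are easy to confuse. In fact, many kernels implement the two with little distinction. There are still important differences. Processes are separate entities, which is why they need explicit inter-process communication, whereas threads share the data of the process they belong to. Communication between threads is therefore easy to implement, and coordinating mutual exclusion between them is also relatively easy.

For communication between processes see 《Linux - 进程 - 进程间通信》. Communication and synchronization between threads can be controlled with mechanisms such as mutexes and semaphores. Threads share the data of the process that contains them, chiefly global data and heap data.

Data on the stack is separate for each thread, because every thread has its own stack. A function's parameters and local variables live on the stack of the thread that calls it and do not interfere with those of other threads; see 《Linux - 进程 - 线程（重要）可重入问题》 for the related discussion of reentrancy.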

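15.1.1 The concept of a thread

A process is an executing entity, and the operating system allocates resources per process. Within the execution space of one process, several smaller units of execution can run concurrently to complete different tasks. These units are called threads.

Each thread in a process has its own execution context: a thread ID, a set of register values, a stack, a signal mask, and so on. The resources of the process are shared by all of its threads, including the program code, global variables, the heap, and file descriptors. Every thread's stack also lies in the shared address space, but each thread normally works only on its own.

Once threads are introduced, the unit of execution in the operating system is no longer the process but the thread. The process becomes the entity that owns resources, while threads do the actual executing; a traditional process can be viewed as a process with a single thread. The Linux kernel implements threads as lightweight processes. Each lightweight process in the kernel corresponds to one thread in user space, has its own process control structure, and is a unit of scheduling. What distinguishes lightweight processes from ordinary processes is that several of them share certain resources, such as the address space, while each keeps its own kernel stack.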

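15.1.2 Advantages of threads

The thread model was introduced to increase parallelism. Running several threads concurrently inside one process address space can make a program considerably more efficient. The advantages can be summarized as follows:

1 Threads share all resources in the address space of the process, so communication between them is convenient. The same task written with multiple processes has to use the inter-process communication mechanisms of the operating system, such as pipes, message queues, and shared memory, which affects both efficiency and design complexity. Multithreaded programming avoids these costs: threads working on related tasks are easy to coordinate, which raises efficiency and lowers complexity.

2 Having several threads handle different tasks increases the concurrency of the program. An interactive program, for example, can create one thread that receives the user's commands and another that processes them. A browser can work this way: one thread handles user input, one displays the data returned by the requested site, and further threads each receive different data packets.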

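Thread identifiers

Like a process, every thread has its own ID, represented by the type pthread_t, much as a process ID is represented by pid_t. POSIX treats pthread_t as an opaque type. On Linux with glibc it happens to be an unsigned integer type, so in effect:

typedef unsigned long int pthread_t;

pthread_t tid;

An unsigned type cannot represent negative numbers, yet printing the ID with

printf("%d \n", tid);

may display a negative number whenever the high bit of the printed value is 1, because %d interprets it as signed. Converting to an unsigned type for output avoids this, for example:

printf("%lu \n", (unsigned long)tid); // %lu prints the value as an unsigned long

and no negative number appears.

Even though the thread ID is an unsigned integer on Linux, always declare it with the system-provided pthread_t rather than a plain integer type. Other systems may define pthread_t differently, and code written against pthread_t keeps working when the underlying definition changes.

On Linux, pthread_self() returns the ID of the calling thread. Its prototype is:

pthread_t pthread_self(void);

To test whether two thread IDs are equal, use pthread_equal(), declared as int pthread_equal(pthread_t t1, pthread_t t2);. Its return value is specified as: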

The pthread_equal() function shall return a non-zero value if t1 and t2 are equal; otherwise, zero shall be returned.

If either t1 or t2 are not valid thread IDs, the behavior is undefined.
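Since pthread_t should be handled as an opaque type, comparing two IDs with == is not portable. The following is a minimal sketch of the comparison done through pthread_equal(); the helper name is_current_thread is hypothetical and not part of the pthread API.

```c
#include <pthread.h>

/* Hypothetical helper: returns 1 if `t` identifies the calling thread, else 0. */
int is_current_thread(pthread_t t)
{
    /* pthread_equal() returns non-zero when the two IDs are equal. */
    return pthread_equal(t, pthread_self()) != 0;
}
```

A thread can call is_current_thread(some_tid) to find out, for instance, whether it is the thread that was designated to perform a particular task.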

15.2.1 Creating a thread: pthread_create()

A thread is created with pthread_create(), declared as:

int pthread_create(pthread_t *restrict thread, const pthread_attr_t *restrict attr, void *(*start_routine)(void*), void *restrict arg);

The parameters are explained in the earlier section "2.1 Creating a default thread: pthread_create() pthread_attr_init()".

DESCRIPTION

The pthread_create() function shall create a new thread, with attributes specified by attr, within a process. If attr is NULL, the default attributes shall be used. If the attributes specified by attr are modified later, the thread's attributes shall not be affected. Upon successful completion, pthread_create() shall store the ID of the created thread in the location referenced by thread.

The thread is created executing start_routine with arg as its sole argument. If the start_routine returns, the effect shall be as if there was an implicit call to pthread_exit() using the return value of start_routine as the exit status. Note that the thread in which main() was originally invoked differs from this. When it returns from main(), the effect shall be as if there was an implicit call to exit() using the return value of main() as the exit status.

The signal state of the new thread shall be initialized as follows:

1 The signal mask shall be inherited from the creating thread.

2 The set of signals pending for the new thread shall be empty.

The alternate stack shall not be inherited.

The floating-point environment shall be inherited from the creating thread.

If pthread_create() fails, no new thread is created and the contents of the location referenced by thread are undefined.

If _POSIX_THREAD_CPUTIME is defined, the new thread shall have a CPU-time clock accessible, and the initial value of this clock shall be set to zero.

RETURN VALUE

If successful, the pthread_create() function shall return zero; otherwise, an error number shall be returned to indicate the error.

ERRORS

The pthread_create() function shall fail if:

EAGAIN

The system lacked the necessary resources to create another thread, or the system-imposed limit on the total number of threads in a process {PTHREAD_THREADS_MAX} would be exceeded.

EINVAL

The value specified by attr is invalid.

EPERM

The caller does not have appropriate permission to set the required scheduling parameters or scheduling policy.

An example follows:

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

#include <pthread.h>

void* thread_fun(void* var) // thread body: a newly created thread starts executing here

{

pid_t pid; // process ID

pthread_t tid; // thread ID

pid = getpid(); // ID of the current process

tid = pthread_self(); // ID of the current thread

printf("the new thread in the process ID is : %u, new thread ID is :%u \n", (unsigned int)pid, (unsigned int)tid);

return NULL;

}

int main(void)

{

pid_t pid;

pthread_t tid, mtid;

int err;

int result;

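pid = getpid(); // ID of the current process

mtid = pthread_self(); // ID of the current (main) thread

err = pthread_create(&tid, NULL, thread_fun, NULL); // create the thread; its ID is stored in tid

if(err != 0) // pthread_create() returns 0 on success and an error number on failure

{

fprintf(stderr, "can't create thread by pthread_create(): error %d \n", err); // the error is in the return value, not in errno, so perror() would mislead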

exit(1);

}

sleep(1);

printf("the main process ID is :%u , thread ID is : %u \n", (unsigned int)pid, (unsigned int)mtid);

result = pthread_equal(tid, mtid); // non-zero if the two thread IDs are equal, 0 otherwise

printf("the value of result is: %d \n", result);

if(result == 0)

{

printf("the main thread is not equal to new thread. \n");

}

else

{

printf("the main thread is equal to new thread. \n");

}

return EXIT_SUCCESS;

}

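15.2.1.1 Compiling and running

Assuming the code above is saved in test.c, it can be compiled with:

[weikaifeng@weikaifeng test]$ gcc test.c -o test -l pthread

The code calls functions from the thread library, so the pthread library has to be linked with the -l option for the program to build and run (with current GCC the -pthread option is the usual way to do this). The result of a run: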

[weikaifeng@weikaifeng test]$ ./test

the new thread in the process ID is : 17495, new thread ID is :3078437744

the main process ID is :17495 , thread ID is : 3078440640

the value of result is: 0

the main thread is not equal to new thread.

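This shows:

1 The main thread has ID 3078440640; it is the thread that runs main().

2 The newly created thread has ID 3078437744; it is a separate thread from the main thread.

3 Both threads report the same process ID, 17495.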

15.2.2 Passing arguments to the thread function: passing several values

pthread_create() creates a thread as follows:

int pthread_create(pthread_t *tid, const pthread_attr_t *tarr, void* (*start_thread)(void*), void* arg);

The fourth parameter, arg, is a generic pointer (void*). It is passed as the argument to the thread's start routine, the function that the third parameter start_thread points to.

Only a single void* can be passed this way. To hand several values, or a large amount of data, to the start routine, pack them into a structure and pass the address of that structure as the fourth argument of pthread_create(). For example:

#include <pthread.h>

#include <stdio.h>

#include <string.h>

typedef struct arg_struct ARG;

struct arg_struct

{

char arg1[12];

int arg2;

float arg3;

};

void* thread_fun(void* arg)

{

ARG* data = (ARG*)arg;

printf("arg1 = %s \n", data->arg1);

printf("arg2 = %d \n", data->arg2);

printf("arg3 = %f \n", data->arg3);

return NULL;

}

int main()

{

pthread_t tid;

ARG data;

strcpy(data.arg1, "feng");

data.arg2 = 12;

data.arg3 = 12.10;

pthread_create(&tid, NULL, thread_fun, (void*)&data); // the fourth argument is what thread_fun() receives as its parameter

pthread_join(tid, NULL);

printf("exit the main thread \n");

return 0;

}

The output is:

arg1 = feng

arg2 = 12

arg3 = 12.100000

exit the main thread

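15.2.3 What a thread can access: the data segment, heap, and stack of the process

The program below defines a global variable, which lives in the data segment, allocates memory on the heap with malloc(), and defines a local variable on the stack of main(). Both the child thread and the main thread can then operate on all three.

The example: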

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

typedef struct arg_struct ARG;

struct arg_struct

{

int *heap; /* pointer to heap data */

int *stack; /* pointer to stack data */

};

int GlobalVar = 6; /* global variable, defined in the data segment */

void* thread_fun(void* arg)

{

ARG* data = (ARG*)arg;

(*(data->heap))++; /* modify the heap data */

(*(data->stack))++; /* modify the stack data */

GlobalVar++; /* modify the data-segment data */

printf("heap = %d \n", *(data->heap));

printf("stack = %d \n", *(data->stack));

printf("GlobalVar = %d \n", GlobalVar);

return NULL;

}

int main()

{

pthread_t tid;

int stack; /* local variable, defined on the stack of main() */

int *heap; /* will point to heap memory obtained with malloc() */

ARG data;

heap = (int*)malloc(1*sizeof(int));

*heap = 2;

stack = 10;

data.heap = heap;

data.stack = &stack;

pthread_create(&tid, NULL, thread_fun, (void*)&data);

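/* pthread_join() blocks the current thread until the child thread has finished. The main thread and the child thread operate on the same data, so letting the child finish first avoids simultaneous access. */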

pthread_join(tid, NULL);

(*heap)++;

stack++;

GlobalVar++;

printf("heap = %d \n", *heap);

printf("stack = %d \n", stack);

printf("GlobalVar = %d \n", GlobalVar);

printf("exit the main thread \n");

return 0;

}

The result of a run:

[weikaifeng@weikaifeng test]$ gcc test.c -o test -l pthread

[weikaifeng@weikaifeng test]$ ./test

heap = 3

stack = 11

GlobalVar = 7

heap = 4

stack = 12

GlobalVar = 8

exit the main thread

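The example shows that the main thread and all other threads share the data segment, heap, and stack memory of the process. The address space of the process is open to every one of its threads.

This is the historical reason the pthread functions report failure by returning an error number instead of setting errno. If errno were a single variable shared by all threads, several failing threads would overwrite it in turn, and a later check would see only the error of whichever thread failed last.

Put differently, failing and then checking a shared errno are not one atomic step. Modern implementations make errno thread-local, which removes that hazard for ordinary library calls, but the pthread functions still return the error number directly, so their result must be checked through the return value rather than errno.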
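As a minimal sketch of that convention, the helper below checks the return value of each pthread call and turns the error number into text with strerror(). The names noop_worker and spawn_and_wait are hypothetical and exist only for this illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *noop_worker(void *arg)
{
    return arg;   /* hand the argument straight back as the exit value */
}

/* Hypothetical helper: runs one thread to completion.
 * Returns 0 on success or the pthread error number on failure. */
int spawn_and_wait(void)
{
    pthread_t tid;
    int err;

    err = pthread_create(&tid, NULL, noop_worker, NULL);
    if (err != 0) {
        /* The error is carried by the return value, not by errno. */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return err;
    }

    err = pthread_join(tid, NULL);
    if (err != 0)
        fprintf(stderr, "pthread_join: %s\n", strerror(err));
    return err;
}
```

Using perror() here would print whatever errno happened to contain, which is why the sketch formats the returned number instead.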

15.2.3.1 Thread stacks are independent of one another

Every thread has its own independent stack, which is what the reentrancy question is about; see 《Linux - 进程 - 线程（重要）可重入问题》.

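15.2.4 Terminating a thread: pthread_exit() pthread_join()

A thread can terminate in three ways:

1 The thread's start routine returns. For example, after pthread_create(&tid, NULL, thread_fun, NULL); creates a thread, the thread ends naturally when thread_fun returns. This is the normal end of a thread's life.

2 The thread is canceled by another thread, much as one process can be killed by another process calling kill().

3 The thread exits on its own by calling pthread_exit(). Calling exit() from any thread is different: it terminates the whole process.

pthread_exit() terminates the calling thread. Its prototype is:

void pthread_exit(void *rval_ptr);

The parameter rval_ptr points to the area that holds the exit information. As with the argument passed to a new thread, several pieces of information can be packed into one structure.

A thread's exit information is therefore one of two things:

1 The area pointed to by the pointer that the start routine returns.

2 The area pointed to by the argument of pthread_exit().

After a thread finishes, the address of its exit information is retained by the system so that other threads can retrieve it with pthread_join():

int pthread_join(pthread_t tid, void** rval_ptr);

The first parameter is the ID of the thread whose exit information is wanted. If that thread is still running, the caller blocks until it finishes. A thread ID is only meaningful inside its own process, so pthread_join() cannot be used to wait for a thread of another process.

This shows that communication between threads of different processes is far less simple than between threads of the same process; it amounts to inter-process communication.

The second parameter is a pointer to a pointer of any type. pthread_join() has to hand back the address of the exit information, which is itself a pointer, so the function needs the address of the caller's pointer variable in order to store it. If the thread ended because its start routine returned or because it called pthread_exit(), *rval_ptr receives the address of the exit information. If the thread ended because it was canceled, *rval_ptr is set to PTHREAD_CANCELED.

If the caller does not care about the exit information, the second argument may be NULL. pthread_join() returns 0 on success and an error number on failure.

An example follows: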

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

void* thread_fun1(void* arg)

{

printf("In the first thread \n");

return (void*)1;

}

void* thread_fun2(void* arg)

{

printf("In the second thread \n");

pthread_exit((void*)2);

printf("should not be execute here \n");

}

void* thread_fun3(void* arg)

{

printf("In the third thread, sleep 2 seconds \n");

sleep(2);

return NULL;

}

int main()

{

pthread_t tid[3];

int i = 0;

void* res; /* receives each thread's exit value */

pthread_create(&tid[0], NULL, thread_fun1, NULL);

pthread_join(tid[0], &res);

printf("result form thread1 is: %ld \n", (long)res);

pthread_create(&tid[1], NULL, thread_fun2, NULL);

pthread_join(tid[1], &res);

printf("result form thread2 is: %ld \n", (long)res);

pthread_create(&tid[2], NULL, thread_fun3, NULL);

pthread_join(tid[2], &res);

printf("result form thread3 is: %ld \n", (long)res);

return 0;

}

The output is:

In the first thread

result form thread1 is: 1

In the second thread

result form thread2 is: 2

In the third thread, sleep 2 seconds

result form thread3 is: 0

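The program above creates three threads, but it calls pthread_join() to wait for each one before creating the next. Used this way the threads gain nothing from concurrency; the code is written like this only to make the values returned through pthread_join() easy to inspect.

15.2.5 The correct way to obtain a thread's exit information

After a thread finishes, the system keeps only the address of the memory that holds the exit information, not a copy of the information itself. That memory must therefore still be valid after the thread has ended, so the exit information must not be stored in a local variable of the thread; use dynamically allocated memory or a global variable instead.

The following example shows both the wrong way and correct ways of obtaining a thread's exit information: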

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

typedef struct test Test;

/* structure used for the test */

struct test

{

int a;

int b;

};

Test var; /* a global variable */

/* This thread returns the address of a local structure, which no longer exists once the thread has ended. */

void* thread_fun1(void* arg)

{

printf("In the first thread \n");

Test a;

a.a = 12;

a.b = 16;

return (void*)&a;

}

void* thread_fun2(void* arg)

{

printf("In the second thread \n");

Test *a;

a = (Test*)malloc(sizeof(Test));

a->a = 1;

a->b = 2;

return (void*)a;

}

void* thread_fun3(void* arg)

{

printf("In the third thread \n");

var.a = 9; /* write to the global variable */

var.b = 9;

return (void*)&var;

}

void* thread_fun4(void* arg)

{

printf("In the fourth thread \n");

Test *a = (Test*)arg;

a->a = 2;

a->b = 3;

return (void*)a;

}

int main()

{

pthread_t tid[4];

int i = 0;

void* res; /* receives the pointer to each thread's exit information */

pthread_create(&tid[0], NULL, thread_fun1, NULL);

pthread_join(tid[0], &res);

printf("result form thread1 is: a = %d, b = %d \n", ((Test*)res)->a, ((Test*)res)->b);

pthread_create(&tid[1], NULL, thread_fun2, NULL);

pthread_join(tid[1], &res);

printf("result form thread1 is: a = %d, b = %d \n", ((Test*)res)->a, ((Test*)res)->b);

pthread_create(&tid[2], NULL, thread_fun3, NULL);

pthread_join(tid[2], &res);

printf("result form thread1 is: a = %d, b = %d \n", ((Test*)res)->a, ((Test*)res)->b);

pthread_create(&tid[3], NULL, thread_fun4, (void*)&var);

pthread_join(tid[3], &res);

printf("result form thread1 is: a = %d, b = %d \n", ((Test*)res)->a, ((Test*)res)->b);

return 0;

}

The output is:

In the first thread

result form thread1 is: a = -1217082512, b = 4001536

In the second thread

result form thread1 is: a = 1, b = 2

In the third thread

result form thread1 is: a = 9, b = 9

In the fourth thread

result form thread1 is: a = 2, b = 3

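The first thread returns the address of a local variable, so once the thread is gone the pointer is invalid and the values read through it are garbage.

15.2.6 Canceling a thread: pthread_cancel()

A process can end another process by sending it a signal, for example SIGKILL through kill(). In the same way, a thread can be canceled by another thread.

pthread_cancel() cancels another thread. Its prototype is:

int pthread_cancel(pthread_t tid);

The parameter is the ID of the thread to cancel. The function returns 0 on success and an error number on failure. The effect is as if the target thread itself had called pthread_exit(PTHREAD_CANCELED);

Note that pthread_cancel() only delivers a cancellation request. The target thread does not necessarily stop at once; with the default settings it terminates when it next reaches a cancellation point.

An example: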

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

void* thread_fun(void* arg)

{

while(1)

printf("in child thread \n");

}

int main()

{

pthread_t tid;

void* res;

int err;

pthread_create(&tid, NULL, thread_fun, NULL);

err = pthread_cancel(tid);

if(0 != err)

{

fprintf(stderr, "fail to use pthread_cancel(): error %d \n", err); /* the error is in the return value, not in errno */

exit(1);

}

pthread_join(tid, &res);

if(PTHREAD_CANCELED == res)

printf("thread %u has been canceled \n", (unsigned int)tid);

else

printf("error \n");

return 0;

}

The output is:

in child thread

thread -1216758928 has been canceled

A thread ID is an unsigned integer, so it should be printed as follows:

printf("thread %u has been canceled \n", (unsigned int)tid);

For more on canceling threads, see the discussion of cancellation points in 16.4.

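15.2.7 Thread cleanup handlers: pthread_cleanup_push() pthread_cleanup_pop()

Like a process, a thread can have user-defined functions called when it exits. These functions are called thread cleanup handlers and are recorded on a stack. pthread_cleanup_push() registers a handler that is then called in certain situations, in the same way that signal() registers a handler that is called when a particular signal arrives.

pthread_cleanup_push() registers a handler and pthread_cleanup_pop() can run it; the other situations in which a registered handler runs are listed below. The prototypes are:

void pthread_cleanup_push(void (*clean_fun)(void*), void* arg);

void pthread_cleanup_pop(int execute);

The first parameter of pthread_cleanup_push() is a pointer to the function to call when the thread exits, so that function is the place to do the thread's cleanup work. A registered handler is not run in the following case:

1 The thread terminates naturally by returning from its start routine. POSIX leaves the effect of returning between a push and its matching pop undefined, so every handler should be popped before the return.

The cleanup handler receives one argument, which is the value passed as the second argument of pthread_cleanup_push().

Note:

Cleanup handlers run in the reverse of the order in which they were registered.

The argument of pthread_cleanup_pop() says whether to run the handler registered by the matching pthread_cleanup_push(). With 0 the handler is not run but is still popped, that is, removed from the current thread's handler stack. With a non-zero value the handler is run and then popped.

A handler registered with pthread_cleanup_push() runs in the following 3 situations:

1 The thread calls pthread_exit() while the handler is still registered. A handler that has already been popped with pthread_cleanup_pop(0) is no longer on the stack and is not run.

2 The thread is canceled by another thread, for example through pthread_cancel().

3 pthread_cleanup_pop() is called with a non-zero argument.

Note: the two functions must be used in pairs, within the same function and block. On Linux they are implemented as macros that open and close a brace, so an unmatched call fails to compile.

For example: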

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

void cleanup_fun(void* arg)

{

printf(" NO. %d clean up produce \n", (*(int*)arg));

}

/* This thread ends normally by returning, so cleanup_fun() runs only where pthread_cleanup_pop() is called with a non-zero argument */

void* thread_fun1(void* arg)

{

int a = 1;

printf("in first thread \n");

pthread_cleanup_push(cleanup_fun, &a);

pthread_cleanup_pop(0); /* argument 0: cleanup_fun() is not called */

a = 2;

pthread_cleanup_push(cleanup_fun, &a);

pthread_cleanup_pop(1); /* non-zero argument: cleanup_fun() is called */

return NULL;

}

void* thread_fun2(void* arg)

{

int a = 11;

printf("in second thread \n");

pthread_cleanup_push(cleanup_fun, &a);

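pthread_cleanup_pop(0); /* argument 0: cleanup_fun() is not called */

pthread_exit(NULL); /* the handler above has already been popped, so nothing is registered here and cleanup_fun() is not called; the code below is never reached */

a = 12;

pthread_cleanup_push(cleanup_fun, &a);

pthread_cleanup_pop(1); /* non-zero argument: would call cleanup_fun() */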

return NULL;

}

void* thread_fun3(void* arg)

{

int a = 31;

printf("in third thread \n");

pthread_cleanup_push(cleanup_fun, &a);

a = 32;

pthread_cleanup_push(cleanup_fun, &a);

pthread_cleanup_pop(1); /* non-zero argument: cleanup_fun() is called */

printf("third thread sleep 3 seconds \n");

sleep(3);

pthread_cleanup_pop(0);

return NULL;

}

int main()

{

pthread_t tid1, tid2, tid3;

pthread_create(&tid1, NULL, thread_fun1, NULL);

pthread_join(tid1, NULL);

pthread_create(&tid2, NULL, thread_fun2, NULL);

pthread_join(tid2, NULL);

pthread_create(&tid3, NULL, thread_fun3, NULL);

pthread_cancel(tid3); /* cancel the third thread from the main thread; handlers still registered in it at that moment run cleanup_fun() */

pthread_join(tid3, NULL);

sleep(1);

}
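To illustrate that pthread_exit() does run cleanup handlers that are still registered, here is a minimal sketch. The counter cleanup_calls and the function names are hypothetical, introduced only for this illustration; the thread calls pthread_exit() between the push and the matching pop.

```c
#include <pthread.h>

static int cleanup_calls = 0;   /* hypothetical counter of handler invocations */

static void count_cleanup(void *arg)
{
    (void)arg;
    cleanup_calls++;
}

static void *exit_inside_scope(void *arg)
{
    (void)arg;
    pthread_cleanup_push(count_cleanup, NULL);
    pthread_exit(NULL);        /* handler is still registered, so it runs */
    pthread_cleanup_pop(0);    /* never reached; required to pair with the push */
    return NULL;
}

/* Hypothetical helper: returns how many times the handler ran, or -1 on error. */
int cleanup_calls_after_exit(void)
{
    pthread_t tid;

    cleanup_calls = 0;
    if (pthread_create(&tid, NULL, exit_inside_scope, NULL) != 0)
        return -1;
    pthread_join(tid, NULL);   /* after the join, the thread's writes are visible */
    return cleanup_calls;
}
```

Under these assumptions the helper reports one invocation, in contrast to a thread that pops its handler with pthread_cleanup_pop(0) before calling pthread_exit().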

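Advanced thread operations

Thread synchronization with mutexes

Threads share the resources of the process's address space, which makes communication between them very easy. The same sharing means that concurrently running threads can perform conflicting operations. This part introduces mutexes, which let the program control how threads access shared resources so that tasks complete correctly.

16.1.1 Initializing and destroying a mutex: pthread_mutex_init() pthread_mutex_destroy()

A mutex is a lock. A thread locks it before accessing a shared resource and releases it when the access is finished, which guarantees that at any moment only one thread is inside the critical section.

Note:

Every thread that wants to enter the critical section must first test the lock. If the lock is already held by some thread, the testing thread blocks until the lock is released and then repeats the test.

Until the holder releases the lock, every thread that tries to use it to enter the critical section blocks, and the blocked threads form a queue.

A mutex is defined with the type pthread_mutex_t and must be initialized before use, for instance with pthread_mutex_init():

int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr);

The first parameter points to the mutex; the function initializes it and returns it to the caller through this parameter. The second parameter gives the mutex attributes; if it is NULL the mutex is initialized with default attributes. pthread_mutex_init() returns 0 on success and an error number on failure.

A mutex can also be initialized by setting it to the constant PTHREAD_MUTEX_INITIALIZER. This approach has a limitation: it cannot be applied to a mutex obtained by dynamic memory allocation. For example:

pthread_mutex_t *mutex;

mutex = (pthread_mutex_t*)malloc(sizeof(pthread_mutex_t));

mutex = PTHREAD_MUTEX_INITIALIZER;

This use of mutex is illegal. Linux defines pthread_mutex_t as an aggregate type, and PTHREAD_MUTEX_INITIALIZER amounts to a list of preset values for its members. It can therefore initialize an object at the point where the object is defined, but it cannot be assigned to a pointer to memory returned by malloc(). The correct form is:

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; // only valid as the initializer in the definition

This is equivalent to pthread_mutex_init(&mutex, NULL); that is, a mutex created this way has the default attributes.

When a mutex is no longer needed it should be destroyed with:

int pthread_mutex_destroy(pthread_mutex_t *mutex);

It returns 0 on success and an error number on failure.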
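For a mutex that lives in dynamically allocated memory, initialization has to go through pthread_mutex_init(). The following is a minimal sketch under that assumption; the helper names mutex_create and mutex_free are hypothetical and not part of the pthread API.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a mutex on the heap and initialize it
 * with default attributes. Returns NULL on failure. */
pthread_mutex_t *mutex_create(void)
{
    pthread_mutex_t *m = malloc(sizeof *m);

    if (m == NULL)
        return NULL;
    if (pthread_mutex_init(m, NULL) != 0) {
        free(m);
        return NULL;
    }
    return m;
}

/* Hypothetical helper: destroy an unlocked mutex and release its memory. */
void mutex_free(pthread_mutex_t *m)
{
    if (m != NULL) {
        pthread_mutex_destroy(m);
        free(m);
    }
}
```

The pairing matters: the mutex is destroyed before its memory is freed, and it should not be locked by any thread at that point.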

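16.1.2 Locking and unlocking a mutex: pthread_mutex_lock() pthread_mutex_trylock() pthread_mutex_unlock()

A mutex is an opaque structure as far as the user is concerned. It should never be manipulated directly, only through the interface functions the system provides. This follows the principle of encapsulation and helps keep programs modular. Linux provides the following functions:

int pthread_mutex_lock(pthread_mutex_t *mutex);

int pthread_mutex_trylock(pthread_mutex_t *mutex);

int pthread_mutex_unlock(pthread_mutex_t *mutex);

The first two acquire the lock of a mutex, locking the critical section; the third releases the lock, unlocking the critical section.

The parameter of pthread_mutex_lock() is the mutex whose lock the caller wants. If the lock is already held by some thread, the call blocks the caller until the lock is released. It returns 0 once the lock has been acquired and an error number on failure.

Note:

The parameter and return value of pthread_mutex_trylock() mean the same as for pthread_mutex_lock(), with one difference: when the lock cannot be acquired, pthread_mutex_trylock() does not block the caller but immediately returns the error number EBUSY, meaning the requested lock is busy.

The non-blocking pthread_mutex_trylock() can be turned into a blocking (busy-waiting) acquire like this:

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

while(pthread_mutex_trylock(&mutex) == EBUSY)

; /* busy-wait: retry until the lock is acquired */

pthread_mutex_unlock() releases the lock of a mutex. Its parameter is the mutex to unlock. It returns 0 on success and an error number on failure.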
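The non-blocking behaviour of pthread_mutex_trylock() can be observed with a minimal sketch like the one below. The helper name trylock_while_held is hypothetical; it assumes a mutex with default attributes, locks it, and then attempts a second non-blocking lock.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock a default mutex, then try to lock it again
 * without blocking. Returns the result of the second attempt, or -1 if
 * even the first attempt failed. */
int trylock_while_held(void)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    int second;

    if (pthread_mutex_trylock(&m) != 0)   /* free mutex: succeeds with 0 */
        return -1;

    second = pthread_mutex_trylock(&m);   /* already locked: returns at once */

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}
```

The second call returns EBUSY immediately instead of blocking, which is the property that makes trylock useful for polling or for backing off when a lock is contended.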

An example:

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

pthread_mutex_t mylock; /* global mutex, so that every thread can reach it */

int global = 0; /* global variable shared by the threads */

int i = 0;

void* thread_fun1(void* arg)

{

printf("In the first thread \n");

pthread_mutex_lock(&mylock);

for(i = 0; i < 10; i++)

{

global = i + 1;

printf("in first thread , global = %d \n", global);

usleep(20);

}

pthread_mutex_unlock(&mylock);

return NULL;

}

void* thread_fun2(void* arg)

{

printf("In the second thread \n");

pthread_mutex_lock(&mylock);

for(i = 0; i < 10; i++)

{

global = i + 3;

printf("in second thread , global = %d \n", global);

usleep(20);

}

pthread_mutex_unlock(&mylock);

return NULL;

}

void* thread_fun3(void* arg)

{

printf("In the third thread \n");

pthread_mutex_lock(&mylock);

for(i = 0; i < 10; i++)

{

global = i + 5;

printf("in third thread , global = %d \n", global);

usleep(20);

}

pthread_mutex_unlock(&mylock);

return NULL;

}

int main()

{

pthread_t tid[4];

int i = 0;

pthread_mutex_init(&mylock, NULL); /* initialize the mutex in the main thread */

pthread_create(&tid[0], NULL, thread_fun1, NULL);

pthread_create(&tid[1], NULL, thread_fun2, NULL);

pthread_create(&tid[2], NULL, thread_fun3, NULL);

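/* If none of the three child threads above has taken mylock yet and the main thread acquires it here, the main thread then spins in while(1) and never reaches pthread_mutex_unlock(), so the child threads can never obtain mylock and block forever. */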

pthread_mutex_lock(&mylock);

while(1);

pthread_mutex_unlock(&mylock);

for(i = 0; i < 3; i++)

{

pthread_join(tid[i], NULL);

}

return 0;

}

The output is:

[weikaifeng@weikaifeng test]$ ./test

In the second thread

in second thread , global = 3

In the third thread

In the first thread

in second thread , global = 4

in second thread , global = 5

in second thread , global = 6

in second thread , global = 7

in second thread , global = 8

in second thread , global = 9

in second thread , global = 10

in second thread , global = 11

in second thread , global = 12

in third thread , global = 5

in third thread , global = 6

in third thread , global = 7

in third thread , global = 8

in third thread , global = 9

in third thread , global = 10

in third thread , global = 11

in third thread , global = 12

in third thread , global = 13

in third thread , global = 14

in first thread , global = 1

in first thread , global = 2

in first thread , global = 3

in first thread , global = 4

in first thread , global = 5

in first thread , global = 6

in first thread , global = 7

in first thread , global = 8

in first thread , global = 9

in first thread , global = 10

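Three threads are started together. The first thread is created first but does not run first; the order is decided by the system. In this run the second thread acquires the mutex first and does its work, then the third thread acquires it, and finally the first thread.

16.1.3 Mutex types

A mutex can be one of the following 4 types:

Lock type	Initializer	Locking behavior	Scheduling behavior
Normal lock	PTHREAD_MUTEX_INITIALIZER	Locking again from the owning thread deadlocks; one unlock releases the lock	Waiters generally obtain the lock in the order they began waiting (not guaranteed by POSIX)
Recursive lock	PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP	The owning thread may lock repeatedly; the lock is released after the same number of unlocks	Waiters generally obtain the lock in the order they began waiting (not guaranteed by POSIX)
Error-checking lock	PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP	Locking again from the owning thread fails with an error; only the owning thread may unlock	Waiters generally obtain the lock in the order they began waiting (not guaranteed by POSIX)
Adaptive lock	PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP	Locking again from the owning thread deadlocks; one unlock releases the lock	All waiting threads compete freely

Note: while developing an RS-485 communication application, I defined the following mutex:

pthread_mutex_t mutex_com_485 = PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP;

When several threads competed for mutex_com_485, some of them were never able to acquire it. With PTHREAD_ADAPTIVE_MUTEX_INITIALIZER_NP all waiting threads compete freely, so a particular thread may lose the race every time and never obtain the lock.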

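Thread synchronization with read-write locks

A read-write lock is another way to synchronize threads. Without synchronization, threads can hit timing-dependent errors; a read-write lock lets the program control access to shared resources so that tasks complete correctly.

16.2.1 Initializing and destroying a read-write lock: pthread_rwlock_init() pthread_rwlock_destroy()

A read-write lock resembles a mutex but allows more parallelism. With a mutex only one thread can hold the lock at a time and all others block. Threads are created to increase parallelism, yet a mutex forces the threads that contend for it to run one after another, which lowers efficiency.

A read-write lock lets a thread take the lock in a mode that matches what it does, because data in memory is only ever read or written. Any number of threads may read at the same time, but only one thread at a time may write.

So several reader threads can hold a read-write lock together, while at most one writer thread can hold it at any moment.

If accesses to the critical resource are mostly reads with few writes, a read-write lock can raise the degree of concurrency.

A read-write lock must be initialized before use:

int pthread_rwlock_init(pthread_rwlock_t *restrict rwlock, const pthread_rwlockattr_t *restrict attr);

The first parameter points to the read-write lock, which the function initializes. The second gives the lock's attributes; NULL selects the defaults. The function returns 0 on success and an error number on failure. A read-write lock that is no longer needed should be destroyed with:

int pthread_rwlock_destroy(pthread_rwlock_t *rwlock);

It returns 0 on success and an error number on failure.

16.2.2 Acquiring and releasing a read-write lock: pthread_rwlock_rdlock() pthread_rwlock_tryrdlock()

On Linux, the following functions acquire a read-write lock in read mode:

int pthread_rwlock_rdlock(pthread_rwlock_t *rwlock);

int pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock);

The parameter of pthread_rwlock_rdlock() is the read-write lock the caller wants in read mode. If other threads already hold the lock in read mode, the caller still obtains it, because all of them only read the critical section.

If a thread holds the lock in write mode, pthread_rwlock_rdlock() blocks until the lock is released. It returns 0 once the lock has been acquired and an error number on failure.

pthread_rwlock_tryrdlock() has the same parameter and return value, with one difference: when the lock cannot be acquired it does not block the thread but returns the error number EBUSY, meaning the requested lock is busy.

Write mode has a corresponding pair of functions, pthread_rwlock_wrlock() and pthread_rwlock_trywrlock(), and pthread_rwlock_unlock() releases a lock held in either mode. For the details see 《Linux C 程序设计大全》.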

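16.3 Condition variables

A condition variable is a synchronization mechanism built on state shared between threads. It involves two actions: one thread suspends itself waiting for the condition to become true, and another thread makes the condition true and signals that it has.

In other words, one thread is suspended waiting for the condition; another thread can make the condition true, at which point the waiting thread returns from its suspension and runs again.

To prevent races, a condition variable is always used together with a mutex.

16.3.1 Creation and destruction

Like a mutex, a condition variable can be created statically or dynamically. The static way uses the constant PTHREAD_COND_INITIALIZER:

pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

The dynamic way calls pthread_cond_init(), declared as:

int pthread_cond_init(pthread_cond_t *cond, pthread_condattr_t *cond_attr);

POSIX defines attributes for condition variables, but LinuxThreads (the older Linux thread implementation) did not implement them, so cond_attr is usually NULL and was ignored there.

A condition variable is destroyed with pthread_cond_destroy(). It may only be destroyed when no thread is waiting on it; otherwise EBUSY is returned. Because the Linux implementation allocates no resources for a condition variable, destroying one consists only of checking for waiting threads. The function is declared as:

int pthread_cond_destroy(pthread_cond_t *cond);

16.3.2 Waiting and signaling

int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);

int pthread_cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *mutex, const struct timespec *abstime);

There are two ways to wait for a condition:

1 Waiting without a time limit, pthread_cond_wait();

2 Timed waiting, pthread_cond_timedwait();

A timed wait ends and returns ETIMEDOUT if the condition has not been signaled by the given time. abstime is an absolute time with the same meaning as the value of the time() system call, where 0 stands for 00:00:00 GMT on January 1, 1970.

Either kind of wait must be paired with a mutex, to prevent the race condition that would arise if several threads called pthread_cond_wait() (or pthread_cond_timedwait(), likewise below) at the same time.

The mutex must be a normal lock (PTHREAD_MUTEX_TIMED_NP) or an adaptive lock (PTHREAD_MUTEX_ADAPTIVE_NP).

There are two ways to signal a condition. pthread_cond_signal() wakes one thread waiting on the condition; when several threads are waiting, one of them is woken (in queue order in LinuxThreads; POSIX leaves the choice to the scheduling policy). pthread_cond_broadcast() wakes all waiting threads.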

16.3.3 The correct way to call pthread_cond_wait()

The calling thread must lock the mutex (pthread_mutex_lock()) before calling pthread_cond_wait().

Assume the following condition variable and mutex:

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

The call is pthread_cond_wait(&cond, &mutex);

The mutex has to stay locked until the thread has been added to the wait queue of the condition, so before calling

pthread_cond_wait(&cond, &mutex);

the thread must call pthread_mutex_lock(&mutex); to lock mutex. The calls therefore come in this order:

pthread_mutex_lock(&mutex);

pthread_cond_wait(&cond, &mutex);

The thread first locks mutex, and pthread_cond_wait() then does the following:

1 Before the thread is suspended and starts waiting, mutex is unlocked. The thread then sleeps, waiting for the condition cond.

2 After cond has been signaled (by pthread_cond_signal() or pthread_cond_broadcast()) and before pthread_cond_wait() returns, mutex is locked again, matching the unlock performed on entry.

So a call to pthread_cond_wait() unlocks mutex, and once the condition has been signaled it locks mutex again before returning.

Consider:

pthread_mutex_lock(&mutex);

pthread_cond_wait(&cond, &mutex);

pthread_mutex_unlock(&mutex);

When this code runs to completion, mutex has been locked twice and unlocked twice:

1 The first lock is the call pthread_mutex_lock(&mutex);

2 The first unlock happens inside pthread_cond_wait(&cond, &mutex); which unlocks mutex and then waits for cond.

3 The second lock also happens inside pthread_cond_wait(&cond, &mutex); once cond has been signaled, the function locks mutex again before it returns.

4 The second unlock is the call pthread_mutex_unlock(&mutex);
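The lock, wait, unlock sequence is usually combined with a shared flag that is tested in a loop, so that a signal sent before the waiter starts waiting is not lost and an unexpected wakeup is harmless. The following is a minimal sketch of that pattern; the flag handoff_ready and the function names are assumptions introduced for this illustration.

```c
#include <pthread.h>

static pthread_mutex_t handoff_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  handoff_cond = PTHREAD_COND_INITIALIZER;
static int handoff_ready = 0;   /* the actual condition, protected by handoff_lock */

static void *handoff_waiter(void *arg)
{
    int *seen = arg;

    pthread_mutex_lock(&handoff_lock);
    while (!handoff_ready)                       /* re-check after every wakeup */
        pthread_cond_wait(&handoff_cond, &handoff_lock);
    *seen = handoff_ready;                       /* read while holding the lock */
    pthread_mutex_unlock(&handoff_lock);
    return NULL;
}

/* Hypothetical helper: start a waiter, set the flag, signal, and report
 * what the waiter observed. Returns -1 if the thread cannot be created. */
int signal_and_collect(void)
{
    pthread_t tid;
    int seen = 0;

    if (pthread_create(&tid, NULL, handoff_waiter, &seen) != 0)
        return -1;

    pthread_mutex_lock(&handoff_lock);
    handoff_ready = 1;                           /* change the condition under the lock */
    pthread_cond_signal(&handoff_cond);
    pthread_mutex_unlock(&handoff_lock);

    pthread_join(tid, NULL);
    return seen;
}
```

With the flag in place no sleep() is needed to order the two threads: if the signal is sent first, the waiter sees handoff_ready already set and never blocks.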

16.3.4 A simple example: pthread_cond_signal() makes the condition true

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

void* thread_fun(void* arg)

{

printf("in the first child thread \n");

pthread_mutex_lock(&mutex);

printf("child thread wait for condition Enable \n");

pthread_cond_wait(&cond, &mutex);

pthread_mutex_unlock(&mutex);

printf("quit at the first child thread \n");

return NULL;

}

int main()

{

pthread_t tid;

printf("In the main thread \n");

if (pthread_create(&tid, NULL, thread_fun, NULL) != 0) {

exit(1);

}

/* The main thread sleeps for 2 seconds so that the child thread can enter thread_fun and call pthread_cond_wait() to wait on the condition variable */

printf("main thread 2 seconds, after, call pthread_cand_signal() to send conditation Enable \n");

sleep(2);

pthread_cond_signal(&cond); /* signal that the condition is true */

pthread_join(tid, NULL);

pthread_mutex_destroy(&mutex);

pthread_cond_destroy(&cond);

printf("quit at the main thread \n");

return 0;

}

The output is:

In the main thread

main thread 2 seconds, after, call pthread_cand_signal() to send conditation Enable

in the first child thread

child thread wait for condition Enable

quit at the first child thread

quit at the main thread

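In the example above, the main thread creates a child thread and then calls sleep(2); so that the child can run thread_fun() and call pthread_cond_wait() to wait on the condition variable. The child thread is then suspended, waiting.

Next the main thread calls pthread_cond_signal(); to signal the condition variable. The condition cond that the child is waiting for is now satisfied, and the child returns from the wait.

Note: the child's call to pthread_cond_wait() already includes releasing and reacquiring the mutex.

16.3.5 A simple example: pthread_cond_wait() releases and reacquires the lock

pthread_cond_wait() takes two parameters, the second of which is a mutex. What exactly does pthread_cond_wait() do with that lock? The following example explains.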

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int var = 1;

void* thread_fun(void* arg)

{

printf("in the first child thread \n");

pthread_mutex_lock(&mutex);

printf("child thread wait for condition Enable \n");

pthread_cond_wait(&cond, &mutex);

printf("var = %d \n", var);

pthread_mutex_unlock(&mutex);

printf("quit at the first child thread \n");

return NULL;

}

int main()

{

pthread_t tid;

printf("In the main thread \n");

if (pthread_create(&tid, NULL, thread_fun, NULL) != 0) {

exit(1);

}

/* The main thread sleeps for 2 seconds so that the child thread can enter thread_fun and call pthread_cond_wait() to wait on the condition variable */

printf("main thread 2 seconds, after, call pthread_cand_signal() to send conditation Enable \n");

sleep(2);

pthread_mutex_lock(&mutex);

var = 2;

pthread_mutex_unlock(&mutex);

pthread_cond_signal(&cond); /* signal that the condition is true */

pthread_join(tid, NULL);

pthread_mutex_destroy(&mutex);

pthread_cond_destroy(&cond);

printf("quit at the main thread \n");

return 0;

}

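If we did not know that pthread_cond_wait() releases and reacquires the lock, we would predict that this program deadlocks, reasoning as follows:

1 The main thread creates the child thread.

2 The main thread calls sleep(2); so that the child can run thread_fun().

3 While the main thread sleeps, the child runs thread_fun(), calls pthread_mutex_lock() to take mutex, and then calls pthread_cond_wait() to wait for cond.

/* From here on, assume we do not know that pthread_cond_wait() releases and reacquires the lock, and continue the reasoning */

4 The main thread wakes from sleep(2); and calls pthread_mutex_lock() to take the lock. The lock is already held by the child and has not been released, so the main thread blocks waiting for the child to release it.

5 The child keeps waiting for cond, but cond is only signaled by the main thread's pthread_cond_signal(), which comes after its pthread_mutex_lock().

The child would be waiting for cond while the main thread waits for mutex: a deadlock.

Yet the program runs correctly and prints var as 2, because the reasoning above misdescribes what pthread_cond_wait() does.

pthread_cond_wait() also releases and reacquires the lock. Why else would its second parameter be a lock?

The reasoning above ignored that release and reacquisition. Corrected from step 3 onward, it reads:

3 The child calls pthread_mutex_lock() to take the lock and then calls pthread_cond_wait(). At that point the child releases the lock given as the second argument, mutex, and waits for the condition given as the first argument, cond.

4 The main thread wakes from sleep(2) and calls pthread_mutex_lock(). Because the child released mutex inside pthread_cond_wait() in step 3, the main thread acquires mutex and assigns to var.

Finally the main thread calls pthread_mutex_unlock() to release mutex.

5 The main thread calls pthread_cond_signal() to signal cond. The condition the child is waiting for is satisfied, and the child returns from pthread_cond_wait(), locking mutex again before it returns.

The child then prints var and only after that unlocks mutex.

Over the whole run, every acquisition of the lock is matched by a release.

16.3.6 A simple example: the producer-consumer problem

The program below models the producer-consumer problem. The main thread is the producer and the child thread is the consumer.

After the main thread calls pthread_create(), the new thread runs thread_func() while the main thread continues, and both call pthread_mutex_lock() to take the lock. Two cases follow, depending on which thread acquires the lock first:

(1) The main thread's pthread_mutex_lock() acquires the lock first. The steps are:

1 The main thread's pthread_mutex_lock() acquires the lock.

2 The child thread's pthread_mutex_lock() fails to acquire it and blocks.

3 The main thread works on the shared resource, calls pthread_cond_signal() to signal cond, and then calls pthread_mutex_unlock() to release the lock.

4 The waiting child thread obtains the lock and tests while(head == NULL). The list now contains data, so head != NULL and the child goes straight on to take an item off the queue. Meanwhile the main thread calls pthread_mutex_lock(), but the child holds the lock, so the main thread blocks.

5 The child calls pthread_mutex_unlock() to release the lock. It then goes around the while(1) loop, calls pthread_mutex_lock() to take the lock, finds head == NULL, and calls pthread_cond_wait(), which releases the lock and waits for the condition variable.

6 The waiting main thread's pthread_mutex_lock() acquires the lock, and the main thread works on the shared resource.

7 The main thread calls pthread_cond_signal() to signal the condition and releases the lock. The child returns from pthread_cond_wait() with the lock held again, processes the data, and releases the lock.

8 The child continues looping from step 5.

(2) The child thread's pthread_mutex_lock() acquires the lock first. The steps are:

1 The child thread calls pthread_mutex_lock() and acquires the lock.

2 The main thread calls pthread_mutex_lock() and fails to acquire it.

3 The child tests while(head == NULL). Because the child got the lock first, the main thread has produced nothing yet, so head == NULL holds.

4 The child calls pthread_cond_wait(), which releases the lock and waits for the condition variable.

5 The main thread obtains the lock that the child released inside pthread_cond_wait(), so its pthread_mutex_lock() returns successfully. The main thread produces data, calls pthread_cond_signal() to signal the condition variable, and releases the lock.

6 The condition the child is waiting for in pthread_cond_wait() is now satisfied; the call locks the mutex again and returns. The child then processes the data produced by the main thread and finally releases the lock.

7 The two threads now compete for the lock again. Whichever thread gets it, production and consumption of the data stay coordinated.

The code is as follows: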

#include <pthread.h>

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

struct node {

int n_number;

struct node *n_next;

}*head = NULL;

static void cleanup_handler(void *arg)

{

printf("Cleanup handler of second thread \n");

free(arg);

(void)pthread_mutex_unlock(&mtx);

}

static void *thread_func(void *arg)

{

struct node *p = NULL;

printf("in the child thread \n");

pthread_cleanup_push(cleanup_handler, p);

while(1){

pthread_mutex_lock(&mtx); /* pthread_cond_wait() must be called with this mutex held */

while (head == NULL){

/* Why wrap pthread_cond_wait() in while (head == NULL)? A waiting thread can be woken spuriously, or another consumer may already have taken the node. If head is still NULL after a wakeup, that is not the state we are waiting for, and the thread must go back into pthread_cond_wait(). */

printf("in child thread the call pthread_cond_wait() wait for condition \n");

pthread_cond_wait(&cond, &mtx);

/* pthread_cond_wait() unlocks mtx, sleeps on the wait queue until it is woken (normally because the condition was signaled), and locks mtx again before returning, so the list is read with the lock held. Sequence: unlock --> block --> wakeup --> lock --> return. */

}

p = head;

head = head->n_next;

printf("Got %d from front of queue\n", p->n_number);

free(p);

pthread_mutex_unlock(&mtx); /* done with the critical section, release the mutex */

}

pthread_cleanup_pop(0);

return 0;

}

int main(void)

{

pthread_t tid;

int i;

struct node *p;

pthread_create(&tid, NULL, thread_func, NULL); /* the child waits for nodes like a consumer; the same pattern also works with several consumers, not just one */

printf("In the main thread, and go to create list \n");

for (i = 0; i < 3; i++) {

pthread_mutex_lock(&mtx); /* head is a shared resource, so lock before touching it */

p = malloc(sizeof(struct node));

p->n_number = i;

p->n_next = head;

head = p;

printf("main thread create %d node \n", i);

pthread_cond_signal(&cond);

pthread_mutex_unlock(&mtx); /* unlock */

sleep(1);

}

printf("main thread call pthread_cancel() to set child thread one cancel point. \n");

pthread_cancel(tid); /* pthread_cancel() asks the child to terminate from outside. The child exits at its next cancellation point, which in this program is almost certainly pthread_cond_wait(). */

pthread_join(tid, NULL);

printf("All done -- exiting \n");

return 0;

}

The output is:

In the main thread, and go to create list

main thread create 0 node

in the child thread

Got 0 from front of queue

in child thread the call pthread_cond_wait() wait for condition

main thread create 1 node

Got 1 from front of queue

in child thread the call pthread_cond_wait() wait for condition

main thread create 2 node

Got 2 from front of queue

in child thread the call pthread_cond_wait() wait for condition

main thread call pthread_cancel() to set child thread one cancel point.

Cleanup handler of second thread

All done -- exiting

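16.3.7 An important use of condition variables --- choosing which thread gets the mutex

A thread blocks on a condition variable with pthread_cond_wait(), and pthread_cond_signal() wakes such a waiter so that its pthread_cond_wait() returns. A typical setup is:

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

pthread_cond_wait(&cond, &mutex);

pthread_cond_wait() must be called with mutex locked. It unlocks mutex when it starts waiting on cond, and it locks mutex again before it returns once cond has been signaled.

pthread_cond_signal() wakes one thread waiting on the condition variable. When several threads are waiting, which one is woken depends on the scheduling policy (often, but not necessarily, the one that has waited longest). pthread_cond_broadcast() wakes all waiting threads.

So what is gained by combining pthread_cond_wait() with a mutex?

When several threads are blocked on the same mutex and it becomes available, all of them compete for it, and there is no telling which one will get it.

If the threads instead block in pthread_cond_wait(), each on its own condition variable, pthread_cond_signal() can wake exactly the thread that should run next.

For example: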

#include <pthread.h>

#include <stdio.h>

#include <unistd.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t cond1 = PTHREAD_COND_INITIALIZER;

pthread_cond_t cond2 = PTHREAD_COND_INITIALIZER;

void* start_thread1(void* arg)

{

/* Thread 1 runs first, so it takes mutex and then calls pthread_cond_wait(), which releases the mutex while it waits for cond1. cond1 is signaled by thread 3. */

printf("in thread 1 \n");

pthread_mutex_lock(&mutex);

printf("thread 1 is wait cond1\n");

pthread_cond_wait(&cond1, &mutex);

pthread_mutex_unlock(&mutex);

printf("thread 1 sleep 3s \n");

sleep(3);

printf("thread 1 make cond2 is enable \n");

/* At this point thread 1 has released the mutex and thread 3 has finished. Thread 2 is blocked in pthread_cond_wait() on cond2, so signal cond2 to let thread 2 run to completion. */

pthread_cond_signal(&cond2);

printf("thread 1 exit \n");

return NULL;

}

void* start_thread2(void* arg)

{

sleep(3); /* thread 2 sleeps 3 seconds so that thread 1 runs first */

printf("in thread 2 \n");

pthread_mutex_lock(&mutex);

printf("thread 2 is wait cond2\n");

pthread_cond_wait(&cond2, &mutex);

pthread_mutex_unlock(&mutex);

printf("thread 2 exit \n");

return NULL;

}

void* start_thread3(void* arg)

{

printf("thread 3 sleep 3s, let thread 1 run! \n");

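sleep(3); /* thread 3 sleeps 3 seconds so that thread 1 runs first */

/* While thread 3 sleeps, thread 1 releases mutex inside pthread_cond_wait(), so thread 3 can take the lock below. Thread 2 wants the same lock after its own sleep(3), so either of the two may get it first.

Once thread 3 holds the lock it signals cond1 and then releases mutex.

Signaling cond1 lets thread 1's pthread_cond_wait() return. Before returning, pthread_cond_wait() locks mutex again, matching the unlock it performed on entry.

One thing about this program is not strictly deterministic:

After thread 3 signals cond1 it calls pthread_mutex_unlock(), which makes mutex available.

Thread 2 may have finished its sleep(3) by then and be blocked on mutex.

At the same time, thread 1's pthread_cond_wait() must re-acquire mutex before it returns.

So during this window both thread 1 and thread 2 may be waiting for the same mutex.

They wait in different ways: thread 1 re-acquires it inside pthread_cond_wait(), thread 2 through pthread_mutex_lock().

POSIX does not say which of them wins; that is up to the scheduler. The program finishes in either case, because thread 2 goes on to wait for cond2, which only thread 1 signals. */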

printf("in thread 3 \n");

pthread_mutex_lock(&mutex);

printf("thread 3 sleep 3s \n");

sleep(3);

printf("the thread 3 make the cond1 is enable \n");

pthread_cond_signal(&cond1);

pthread_mutex_unlock(&mutex);

printf("thread 3 exit \n");

return NULL;

}

int main(void)

{

pthread_t tid1, tid2, tid3; /*The thread ID*/

printf("In the main thread \n");

pthread_create(&tid1, NULL, start_thread1, NULL);

pthread_create(&tid2, NULL, start_thread2, NULL);

pthread_create(&tid3, NULL, start_thread3, NULL);

pthread_join(tid1, NULL);

pthread_join(tid2, NULL);

pthread_join(tid3, NULL);

return 0;

}

The output is:

[weikaifeng@weikaifeng test]$ ./test

In the main thread

thread 3 sleep 3s, let thread 1 run!

in thread 1

thread 1 is wait cond1

in thread 2

thread 2 is wait cond2

in thread 3

thread 3 sleep 3s

the thread 3 make the cond1 is enable

thread 3 exit

thread 1 sleep 3s

thread 1 make cond2 is enable

thread 1 exit

thread 2 exit

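As the output shows, thread 1 first waits on cond1 and thread 2 waits on cond2. Thread 3 then signals cond1 so that thread 1 runs, and thread 1 in turn signals cond2 so that thread 2 runs.

In other words, thread 3 controls thread 1 and thread 1 controls thread 2, which is visible in the order in which the threads exit.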
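The same idea, deciding which thread runs next, can be made independent of sleep() timing by waiting on a predicate. The sketch below is a variation, not the program above: it uses one condition variable with pthread_cond_broadcast() and a shared turn counter instead of one condition variable per thread, and all names (turn, take_turn(), threads_run_in_turn_order()) are invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t turn_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t turn_cond = PTHREAD_COND_INITIALIZER;
static int turn;       /* whose turn it is; protected by turn_mutex */
static int order[3];   /* order in which the threads actually ran */
static int finished;   /* number of threads that have run */

static void *take_turn(void *arg)
{
    int my_turn = *(int *)arg;

    pthread_mutex_lock(&turn_mutex);
    while (turn != my_turn)                 /* not my turn yet: keep waiting */
        pthread_cond_wait(&turn_cond, &turn_mutex);
    order[finished++] = my_turn;            /* the "work" of this thread */
    turn++;                                 /* hand over to the next thread */
    pthread_cond_broadcast(&turn_cond);     /* wake everyone; only one proceeds */
    pthread_mutex_unlock(&turn_mutex);
    return NULL;
}

/* Returns 1 if the threads ran in the order 0, 1, 2, and -1 on error. */
int threads_run_in_turn_order(void)
{
    pthread_t tids[3];
    int ids[3] = { 2, 0, 1 };               /* created in scrambled order */

    turn = 0;
    finished = 0;
    for (int i = 0; i < 3; i++)
        if (pthread_create(&tids[i], NULL, take_turn, &ids[i]) != 0)
            return -1;
    for (int i = 0; i < 3; i++)
        pthread_join(tids[i], NULL);
    return order[0] == 0 && order[1] == 1 && order[2] == 2;
}
```

Because each thread re-checks turn under the mutex, a wakeup that arrives early or is spurious cannot let a thread run out of order.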

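16.4 POSIX thread cancellation points --- and what pthread_testcancel() is for

When the main thread returns from main() (or any thread calls exit()), the process ends and all of its other threads end with it.

A thread can also be terminated with pthread_cancel(). That call only sends a cancellation request to the target thread, asking it to terminate.

Whether and when the target actually terminates is decided by the target thread itself.

For example, when thread A calls pthread_cancel() on thread B, all that happens is this:

Thread B receives a cancellation request. How it reacts depends on B's own cancellation settings: B can ignore the request for now, terminate immediately, or keep running until it reaches a cancellation point. The functions listed below are cancellation points; calling one of them in thread B amounts to placing a cancellation point at that call.

So if thread B accepts cancellation requests, it does not have to terminate the moment a request arrives; it can run on until the next cancellation point.

If thread B currently has cancellation disabled, it can call pthread_setcancelstate() to enable it again.

According to POSIX, pthread_join(), pthread_testcancel(), pthread_cond_wait(), pthread_cond_timedwait(), sem_wait() and sigwait(), as well as blocking system calls such as read() and write(), are cancellation points. Most other pthread functions, including pthread_mutex_lock(), are not.

The pthread_cancel manual page of the old LinuxThreads library stated, however, that because LinuxThreads was poorly integrated with the C library, C library functions were not cancellation points at that time. (LinuxThreads has since been replaced by NPTL in glibc, where read(), write() and the other functions required by POSIX are cancellation points.)

Under LinuxThreads, then, read() from the C library was not a cancellation point. If a thread that has received a cancellation request should keep running up to the read() call and be cancelled there, pthread_testcancel() can be used to create the cancellation point explicitly.

That gives code like this:

pthread_testcancel();

retcode = read(fd, buffer, length);

pthread_testcancel();

This effectively turns the read() call site into a cancellation point.

Cancellation lets a thread request the termination of any other thread in its process. It is useful when a group of related threads has no further work to do.

pthread_cancel() cancels a thread, but a thread cannot be cancelled safely at an arbitrary moment. When is it safe?

This is where cancellation points come in.

The pthreads standard specifies several cancellation points, including:

1 A cancellation point created programmatically by calling pthread_testcancel().

2 A thread waiting for a condition in pthread_cond_wait() or pthread_cond_timedwait().

3 A thread blocked in sigwait(2).

4 Some standard library calls, in general those on which a thread can block. printf(), for instance, is one of the functions that POSIX allows to be a cancellation point. The example in the next section shows the effect: the main thread calls pthread_cancel() on a child that executes while(1); in an endless loop. The request stays pending, because with the default deferred cancellation type the child acts on it only when it reaches a cancellation point, and the empty loop never reaches one. So the main thread cannot cancel the child. If the loop is changed to while(1) printf("hehe ..\n"); the child normally passes through a cancellation point on every iteration, and pthread_cancel() terminates it as expected.

The takeaway: pthread_cancel() does not stop the target thread at once. With deferred cancellation the thread is cancelled only when it reaches a cancellation point, and that is what makes cancellation safe.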

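Cancellation is enabled by default. An application may sometimes want to disable it; while it is disabled, all cancellation requests are held pending until cancellation is enabled again. When a thread with cancellation enabled calls one of the cancellation-point functions that POSIX lists (pthread_join(), pthread_testcancel(), pthread_cond_wait(), pthread_cond_timedwait(), sem_wait(), sigwait(), and blocking system calls such as read() and write()) and another thread has called pthread_cancel() on it, the thread is cancelled at that call.

Under the old LinuxThreads library, where C library functions were not cancellation points, the cancellation request still made the thread return from a blocking system call with errno set to EINTR. Calling pthread_testcancel() before and after a system call that should act as a cancellation point therefore gave the behavior POSIX requires, as in the following fragment: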

pthread_testcancel();

retcode = read(fd, buffer, length);

pthread_testcancel();

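Note on program design:

If a thread runs an endless loop and no path through the loop body is guaranteed to reach a cancellation point, cancellation requests from other threads can never terminate it. Such a loop should call pthread_testcancel() on a path that every iteration takes.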
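As a minimal sketch of that advice, the busy loop below calls pthread_testcancel() on every iteration, so a pending request takes effect almost immediately. The function names spin_with_testcancel() and cancel_spinning_thread() are made up for this illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Busy loop with an explicit cancellation point on every iteration. */
static void *spin_with_testcancel(void *arg)
{
    (void)arg;
    for (;;)
        pthread_testcancel();   /* acts on a pending request, otherwise returns */
    return NULL;                /* not reached */
}

/* Returns 1 if the spinning thread was cancelled, 0 if not, -1 on error. */
int cancel_spinning_thread(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, spin_with_testcancel, NULL) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)           /* only queues the request */
        return -1;
    if (pthread_join(tid, &result) != 0)    /* returns once the thread is gone */
        return -1;
    return result == PTHREAD_CANCELED;
}
```

Without the pthread_testcancel() call the loop would contain no cancellation point, and pthread_join() would block forever.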

16.4.1 An example --- placing a cancellation point inside while(1)

#include <pthread.h>

#include <stdio.h>

#include <unistd.h>

void* thr(void* arg)

{

pthread_setcancelstate(PTHREAD_CANCEL_ENABLE,NULL);

pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED,NULL);

while(1)

{

//printf("execute while(1) \n");

//pthread_testcancel();

;

}

printf("thread is not running\n");

return NULL;

}

int main()

{

pthread_t th1;

int err;

printf("In the main thread \n");

err = pthread_create(&th1,NULL,thr,NULL);

sleep(2);

pthread_cancel(th1);

pthread_join(th1,NULL);

printf("Main thread is exit\n");

return 0;

}

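In the code above, if the cancellation points inside the child's while(1) loop stay commented out, the main thread cannot cancel the child, and pthread_join() never returns.

Suppose the thread function looks like this instead: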

void* thr(void* arg)

{

printf("in child thread \n");

pthread_setcancelstate(PTHREAD_CANCEL_ENABLE,NULL);

pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED,NULL);

while(1)

{

;

}

printf("thread is not running\n");

}

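The original author reports that the main thread can cancel this version as well, on the grounds that the call printf("in child thread \n") acts as a cancellation point. Note, however, that this printf() runs before the loop, while the cancellation request is only sent after sleep(2). With deferred cancellation a request is acted on at the next cancellation point the thread reaches after the request arrives, so for reliable behavior the cancellation point has to be inside the loop.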

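16.4.2 Placing cancellation points

Cancellation is dangerous. Most of the danger lies in restoring invariants and releasing shared resources correctly. A carelessly cancelled thread may leave a mutex locked, which leads to deadlock, or it may leave memory allocated that nothing else knows about and that therefore can never be freed.

The standard C library specifies a cancellation interface for enabling and disabling cancellation programmatically. The library defines cancellation points, the set of points at which cancellation can occur, and it lets you define the scope of cancellation handlers so that they run when and where expected. The cleanup services that cancellation handlers provide restore resources and state to a condition consistent with the point of origin.

Placing cancellation points and writing cancellation handlers requires an understanding of the application. Locking a mutex is explicitly not a cancellation point, and a mutex should be held only as long as necessary. Limit regions of asynchronous cancellation to sequences with no external dependencies, because such dependencies could leave dangling resources or unresolved state. Take care to restore the cancellation state when returning from a nested alternate cancellation state. The interface supports this: pthread_setcancelstate(3C) stores the current cancellation state in a referenced variable, and pthread_setcanceltype(3C) stores the current cancellation type in the same way.

Cancellation can occur in three situations:

1 Asynchronously

2 At points in the execution sequence defined by the standard

3 At a call to pthread_testcancel()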

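16.4.3 Thread cancellation functions --- pthread_setcancelstate()

int pthread_cancel(pthread_t thread);

Returns 0 on success and an error number on failure:

ESRCH: no thread with the given thread ID was found.

int pthread_setcancelstate(int state, int *oldstate); sets how the calling thread responds to cancellation requests

There are two states:

PTHREAD_CANCEL_ENABLE: the default. A cancellation request is acted on according to the thread's cancellation type.

PTHREAD_CANCEL_DISABLE: a cancellation request stays pending and the thread keeps running until cancellation is enabled again.

Returns 0 on success and an error number on failure:

EINVAL: state is neither PTHREAD_CANCEL_ENABLE nor PTHREAD_CANCEL_DISABLE.

int pthread_setcanceltype(int type, int *oldtype);

There are two types; they take effect only in the PTHREAD_CANCEL_ENABLE state:

PTHREAD_CANCEL_ASYNCHRONOUS: the thread can be cancelled at any time, usually as soon as the request arrives.

PTHREAD_CANCEL_DEFERRED: the default. The thread runs on to its next cancellation point.

Returns 0 on success and an error number on failure:

EINVAL: type is neither PTHREAD_CANCEL_ASYNCHRONOUS nor PTHREAD_CANCEL_DEFERRED.

void pthread_testcancel(void);

pthread_testcancel() has an effect when cancellation is enabled and the cancellation type is deferred. Called while cancellation is disabled, it does nothing.

Insert pthread_testcancel() only in sequences where it is safe to cancel the thread. Besides the cancellation points created programmatically with pthread_testcancel(), the pthreads standard specifies several others.

In short, pthread_testcancel() tests for a pending cancellation request and, if there is one, acts on it.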
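The interplay of these calls can be sketched as follows: a worker disables cancellation around a section that must not be interrupted, restores the previous state, and only then lets a pending request take effect. This is an illustration under assumptions; the names guarded_worker() and run_guarded_worker() are invented, and two semaphores (covered in section 16.5) serve only to make the timing deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t started;        /* posted once cancellation has been disabled */
static sem_t cancel_sent;    /* posted after pthread_cancel() has returned */
static int progress;         /* 1 = protected section done, 2 = ran too far */

static void *guarded_worker(void *arg)
{
    int old_state;
    (void)arg;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    sem_post(&started);
    sem_wait(&cancel_sent);      /* a cancellation point, but cancellation is disabled */
    progress = 1;                /* the protected section completes */
    pthread_setcancelstate(old_state, NULL);   /* restore the saved state */
    pthread_testcancel();        /* the pending request takes effect here */
    progress = 2;                /* should not be reached */
    return NULL;
}

/* Returns the value of progress if the worker ended by cancellation, else -1. */
int run_guarded_worker(void)
{
    pthread_t tid;
    void *result = NULL;

    progress = 0;
    sem_init(&started, 0, 0);
    sem_init(&cancel_sent, 0, 0);
    if (pthread_create(&tid, NULL, guarded_worker, NULL) != 0)
        return -1;
    sem_wait(&started);          /* the worker has disabled cancellation */
    pthread_cancel(tid);         /* the request stays pending */
    sem_post(&cancel_sent);
    pthread_join(tid, &result);
    sem_destroy(&started);
    sem_destroy(&cancel_sent);
    return result == PTHREAD_CANCELED ? progress : -1;
}
```

Saving the old state in old_state and restoring it, rather than enabling cancellation unconditionally, keeps the function safe to call from code that had already disabled cancellation.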

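16.4.4 pthread_cancel() in more detail

Cancellation is a mechanism by which one thread can end another. More precisely, one thread sends another a request to terminate. Depending on its settings, the target may ignore the request, terminate immediately, or defer termination until the next cancellation point.

When a thread finally honors a cancellation request, it behaves as if it had called

pthread_exit(PTHREAD_CANCELED): all cleanup handlers are called in the reverse order of their registration, the finalization functions for thread-specific data are called, and the thread terminates with PTHREAD_CANCELED as its exit value.

So pthread_cancel() does not stop a thread immediately; it depends on how the target handles the request. This is the reason cancellation points are defined as described above.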
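A small sketch can make the two claims above observable: the handlers run in reverse order of registration, and pthread_join() reports PTHREAD_CANCELED. The handler record_handler() and the other names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>

static int call_order[2];   /* handler ids in the order they were called */
static int call_count;

/* Cleanup handler: records which handler ran. */
static void record_handler(void *arg)
{
    call_order[call_count++] = *(int *)arg;
}

static void *worker_with_handlers(void *arg)
{
    static int first = 1, second = 2;
    (void)arg;

    pthread_cleanup_push(record_handler, &first);    /* registered first */
    pthread_cleanup_push(record_handler, &second);   /* registered second */
    for (;;)
        pthread_testcancel();                        /* cancelled here */
    pthread_cleanup_pop(0);   /* never reached, but push and pop must pair up */
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 1 if the thread was cancelled and the handlers ran as 2, then 1. */
int cancelled_handlers_run_in_reverse(void)
{
    pthread_t tid;
    void *result = NULL;

    call_count = 0;
    if (pthread_create(&tid, NULL, worker_with_handlers, NULL) != 0)
        return -1;
    pthread_cancel(tid);
    pthread_join(tid, &result);
    return result == PTHREAD_CANCELED && call_count == 2
        && call_order[0] == 2 && call_order[1] == 1;
}
```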

16.4.5 A deadlock caused by pthread_cancel() during pthread_cond_wait()

The following example from the web deadlocks because of a call to pthread_cancel():

#include <pthread.h>

#include <stdlib.h>

#include <unistd.h>

pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

void* thread0(void* arg)

{

pthread_mutex_lock(&mutex);

pthread_cond_wait(&cond, &mutex);

pthread_mutex_unlock(&mutex);

pthread_exit(NULL);

}

void* thread1(void* arg)

{

sleep(10);

pthread_mutex_lock(&mutex);

pthread_cond_broadcast(&cond);

pthread_mutex_unlock(&mutex);

pthread_exit(NULL);

}

int main()

{

pthread_t tid[2];

if (pthread_create(&tid[0], NULL, &thread0, NULL) != 0) {

exit(1);

}

if (pthread_create(&tid[1], NULL, &thread1, NULL) != 0) {

exit(1);

}

sleep(5);

pthread_cancel(tid[0]);

pthread_join(tid[0], NULL);

pthread_join(tid[1], NULL);

pthread_mutex_destroy(&mutex);

pthread_cond_destroy(&cond);

return 0;

}

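What happens is this. The main thread calls pthread_cancel(tid[0]) while thread tid[0] is blocked in pthread_cond_wait(), so the thread dies there. Recall how pthread_cond_wait() works: it unlocks the mutex passed as its second argument before it starts waiting, and once it is woken it locks that mutex again before returning, balancing the earlier unlock.

That is the problem. When thread tid[0] is cancelled inside pthread_cond_wait(), the mutex is re-acquired before the thread terminates. The thread then dies without ever reaching pthread_mutex_unlock(), so the mutex stays locked, thread tid[1] can never acquire it, and the program deadlocks.

The fix is to make sure pthread_mutex_unlock() is called when thread tid[0] terminates, so that thread tid[1] can get the lock.

This can be done by registering a cleanup handler for thread tid[0]. The modified thread looks like this: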

void cleanup(void *arg)

{

pthread_mutex_unlock(&mutex);

}

void* thread0(void* arg)

{

pthread_cleanup_push(cleanup, NULL); // thread cleanup handler

pthread_mutex_lock(&mutex);

pthread_cond_wait(&cond, &mutex);

pthread_mutex_unlock(&mutex);

pthread_cleanup_pop(0);

pthread_exit(NULL);

}
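To check that such a handler really leaves the mutex usable, the sketch below cancels a waiter and then probes the mutex with pthread_mutex_trylock(). It is a variation on the fix above, not a copy: the handler is pushed after the mutex has been locked (so it can never unlock a mutex the thread does not hold), and the names unlock_mutex(), cancellable_waiter() and mutex_is_free_after_cancel() are invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t wait_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t never_signaled = PTHREAD_COND_INITIALIZER;

/* Cleanup handler: releases the mutex that pthread_cond_wait() re-acquired. */
static void unlock_mutex(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

static void *cancellable_waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&wait_mutex);
    pthread_cleanup_push(unlock_mutex, &wait_mutex);
    for (;;)                                  /* nobody signals: wait until cancelled */
        pthread_cond_wait(&never_signaled, &wait_mutex);
    pthread_cleanup_pop(1);                   /* not reached; pairs with the push */
    return NULL;
}

/* Returns 1 if the mutex can be locked after the waiter was cancelled. */
int mutex_is_free_after_cancel(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, cancellable_waiter, NULL) != 0)
        return -1;
    pthread_cancel(tid);
    pthread_join(tid, &result);
    if (result != PTHREAD_CANCELED)
        return -1;
    if (pthread_mutex_trylock(&wait_mutex) != 0)
        return 0;                             /* still locked: the deadlock case */
    pthread_mutex_unlock(&wait_mutex);
    return 1;
}
```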

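16.5 Semaphores

16.5.1 How semaphores relate to mutexes

Semaphores and mutexes are both used to protect critical resources.

A mutex only guarantees that at most one thread operates on the critical section at a time:

pthread_mutex_lock(); // request access to the critical resource; blocks until access is granted

...// access granted, operate on the critical resource

pthread_mutex_unlock(); // give up access

With a mutex, whichever thread acquires it first gets to the critical resource first; a mutex imposes no ordering between threads. For example:

Two threads A and B share

pthread_mutex_t mylock;

and both run the following code:

pthread_mutex_lock(&mylock); // request access to the critical resource; blocks until access is granted

...// access granted, operate on the critical resource

pthread_mutex_unlock(&mylock); // give up access

When A and B start, the CPU scheduler decides which runs first, so it is not known which thread will acquire the mutex first.

Thread A may enter the critical section first, or thread B may.

A mutex alone therefore cannot solve the producer-consumer problem.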

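Consider two producer-consumer scenarios:

1 Nothing to consume at the start

A producer thread A and a consumer thread B share a memory area Buffer[ ] that holds items. Producer A writes items into it and consumer B reads them out.

At the start Buffer[ ] is empty, so consumer B must not access it. Only after producer A has produced an item and written it to Buffer[ ] may consumer B access Buffer[ ]. The two threads must therefore enter the critical section in a particular order.

2 Existing items must be consumed first

Again producer A and consumer B share Buffer[ ], with A writing items and B reading them.

This time Buffer[ ] is 1 byte in size.

At the start Buffer[ ] already holds 1 byte of data, so producer A must not write to it or it would overflow.

Consumer B has to consume the item in Buffer[ ] first; only then may producer A write new data.

Thread B must access the critical section before thread A does, so once more the accesses have to happen in a particular order.

Problems that require ordered access like this can be solved with semaphores.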

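A semaphore manages access with a counter of available resources (call it value):

1 value = N > 0 --- N units of the critical resource are available, so N users may operate on it at once. For example, value == 1 means one user may enter.

2 value == 0 --- no units are available, and no user may access the critical resource right now.

3 value = -N < 0 --- the critical resource is unavailable and N users are waiting for it. (This is the textbook model; the counter of a POSIX semaphore itself never drops below 0.)

For example, suppose value = 0 at the start, meaning the critical section may not be accessed.

Then:

1 Consumer B wants to access the critical resource and finds value == 0. The resource is unavailable, so B blocks and waits, or goes off to do something else.

2 Producer A produces an item. Since value == 0 the buffer is empty and can take the item, so A writes it into the critical section and sets value = 1.

3 Consumer B tries again and finds value == 1. The resource is available, so B enters the critical section and performs value--. value is now 0, so no other consumer can grab the same item.

As another example, suppose the buffer Buffer[ 5 ] starts out full with 5 items to consume, so 5 consumers may run.

In that case value is set to 5: there are 5 units of the resource and 5 consumers may consume them.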
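The counter model can be tried out directly with a POSIX semaphore. The sketch below uses sem_init(), sem_trywait() and sem_destroy(), which are described in section 16.5.5; the helper name count_successful_takes() is made up for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Tries `attempts` times to take one unit from a semaphore initialised to
   `initial`; returns how many attempts succeeded without blocking. */
int count_successful_takes(unsigned int initial, int attempts)
{
    sem_t slots;
    int taken = 0;

    if (sem_init(&slots, 0, initial) == -1)
        return -1;
    for (int i = 0; i < attempts; i++) {
        if (sem_trywait(&slots) == 0) {
            taken++;                    /* value was > 0 and has been decremented */
        } else if (errno != EAGAIN) {   /* EAGAIN means value == 0: nothing left */
            sem_destroy(&slots);
            return -1;
        }
    }
    sem_destroy(&slots);
    return taken;
}
```

With value = 5, exactly five takers get through no matter how many try, which is the Buffer[ 5 ] case described above.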

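16.5.2 Semaphore P and V operations (reposted)

Any account of the P and V primitives has to mention the renowned Dutch computer scientist E. W. Dijkstra. If the name does not ring a bell, Dijkstra's algorithm for the shortest-path problem in graph theory certainly will.

He introduced the P and V primitives, and the semaphores they operate on, in 1965. Semaphores are the earliest mechanism for solving process synchronization and mutual exclusion (they can also be used for inter-process communication). The mechanism consists of a variable called the semaphore and two primitive operations on it.

A semaphore is an integer; call it sem. By convention:

1 When sem is greater than or equal to zero, it is the number of resource units available to concurrent processes.

2 When sem is less than zero, its absolute value is the number of processes waiting to use the critical section.

It follows that the initial value given to a semaphore must be greater than or equal to zero.

The P and V operations are program sections that cannot be interrupted, which is what is meant by a primitive. P stands for the Dutch passeren, equivalent to the English pass, and V for the Dutch verhogen, equivalent to the English increment. No interrupt may occur while a P or V primitive executes. There are many ways to implement them, in hardware or in software. The mechanism needs shared memory and so cannot be used in a distributed operating system; that is its greatest weakness.

First, what the P and V operations mean. They are primitives (uninterruptible procedures) that operate on a semaphore, defined as follows:

P(S): (1) decrement the semaphore S by 1, that is S = S - 1; (2) if S >= 0 the process continues; otherwise the process is put into the waiting state and placed in the wait queue.

V(S): (1) increment the semaphore S by 1, that is S = S + 1; (2) if S > 0 the process continues; otherwise the first process waiting on the semaphore is released from the queue, and the calling process continues as well.

What P and V are for: semaphores and the P and V operations are used to implement synchronization and mutual exclusion between processes. P and V belong to the low-level forms of inter-process communication.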

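What is a semaphore?

The data structure of a semaphore is a value and a pointer; the pointer refers to the next process waiting on the semaphore. The value reflects how the corresponding resource is being used.

When the value is greater than 0, it is the number of resource units currently available.

When the value is less than 0, its absolute value is the number of processes waiting for the resource.

Note that the value of a semaphore can be changed only by P and V. In general, when S >= 0, S is the number of available resource units. A P operation requests one unit, so S is decremented by 1.

If S < 0 afterwards, no unit is available and the requester must wait until another process releases one. A V operation releases one unit, so S is incremented by 1.

If S <= 0 afterwards, some processes are waiting for the resource, so one waiting process is woken up and allowed to continue.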

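When using P and V for mutual exclusion, note the following:

(1) In every program the P and V operations used for mutual exclusion must come in pairs: P first, to enter the critical section, then V, to leave it. If the code has several branches, check the pairing carefully on each.

// The P and V operations described here correspond to pthread_mutex_lock() and pthread_mutex_unlock() on a mutex. For a semaphore used for synchronization, sem_wait() and sem_post() do not have to appear as a pair in the same thread.

More precisely, a mutex behaves like a special case of a semaphore: one whose initial value is 1. (One difference is that a mutex has an owner and must be unlocked by the thread that locked it.)

Suppose a semaphore is initialized with value == 1. After sem_wait(), value = value - 1 = 0.

When another thread now calls sem_wait(), it sees value == 0 and blocks.

Only after sem_post() has been called, making value = value + 1 = 1,

is the thread blocked in sem_wait() woken up.

A mutex works the same way. The first pthread_mutex_lock() succeeds; until pthread_mutex_unlock() releases the mutex, any further pthread_mutex_lock() blocks, as if the mutex had an internal value == 0.

Only after pthread_mutex_unlock() has released the mutex is the internal value 1 again, and a call to pthread_mutex_lock() can succeed.

(2) The P and V operations should sit right at the start and end of the critical section. Keep the critical section as short as possible, and never put an endless loop in it.

(3) A semaphore used for mutual exclusion normally has the initial value 1.

Using semaphores and P/V for process synchronization: P/V is one of the classic synchronization mechanisms. A semaphore is associated with a message. A value of 0 means the expected message has not been produced yet; a nonzero value means it is available. P is called to test whether the message has arrived, and V is called to send the message.

When using P and V for synchronization, note the following:

(1) Analyze the constraints between the processes and decide what kinds of semaphores are needed. Work out which process must run first and which later for the synchronization to be correct, and which resources (semaphores) they coordinate through; this determines which semaphores to set up.

(2) The initial value of a semaphore depends on the number of corresponding resource units and on where the P and V operations appear in the code.

(3) P and V operations on the same semaphore must come in pairs, but the two halves of a pair are in the code of different processes.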
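The mutual-exclusion use described above, a semaphore with initial value 1 bracketing a critical section with P and V, can be sketched with POSIX semaphores as follows. The names lock_sem, shared_counter and count_with_binary_semaphore() are invented for the illustration, and sem_wait()/sem_post() play the roles of P and V.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define INCREMENTS 100000

static sem_t lock_sem;          /* binary semaphore, initial value 1 */
static long shared_counter;     /* the critical resource */

static void *add_many(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        sem_wait(&lock_sem);    /* P: enter the critical section */
        shared_counter++;
        sem_post(&lock_sem);    /* V: leave the critical section */
    }
    return NULL;
}

/* Two threads increment the counter; returns the final value. */
long count_with_binary_semaphore(void)
{
    pthread_t a, b;

    shared_counter = 0;
    if (sem_init(&lock_sem, 0, 1) == -1)
        return -1;
    pthread_create(&a, NULL, add_many, NULL);
    pthread_create(&b, NULL, add_many, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&lock_sem);
    return shared_counter;
}
```

Without the P/V pair around shared_counter++ the two threads could lose updates, and the final value would usually be smaller than 2 * INCREMENTS.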

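16.5.3 Semaphore P and V operations (reposted) --- a producer and a consumer using two semaphores

[Example 1] The producer-consumer problem. In a multiprogramming environment, process synchronization is an important and interesting problem, and the producer-consumer problem is a representative case. Several variants are given below; analyzing and understanding them thoroughly is a great help in solving synchronization and mutual exclusion problems in operating systems generally.

(1) One producer and one consumer share a single buffer. Define two synchronization semaphores:

empty --- whether the buffer is empty, initial value 1.

full --- whether the buffer is full, initial value 0.

Producer process

while(TRUE)

{

produce an item;

P(empty);

put the item into Buffer;

V(full);

}

Consumer process

while(TRUE)

{

P(full);

take an item out of Buffer;

V(empty);

consume the item;

}

(2) One producer and one consumer share n buffers arranged as a ring.

Define two synchronization semaphores:

empty --- the number of empty buffers, initial value n.

full --- the number of full buffers, initial value 0.

The buffers are numbered 0 to n-1. Two indices in and out are used by the producer and the consumer respectively; each points to the next buffer to use.

Producer process

while(TRUE)

{

produce an item;

P(empty);

put the item into buffer(in);

in=(in+1)mod n;

V(full);

}

Consumer process

while(TRUE)

{

P(full);

take the item out of buffer(out);

out=(out+1)mod n;

V(empty);

consume the item;

}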
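Variant (2) translates fairly directly into POSIX threads and semaphores. The following is a sketch under assumptions: SLOTS stands for n, the producer generates the integers 1 to ITEMS instead of running forever, and the names ring, empty_slots, full_slots and run_ring_buffer() are invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define SLOTS 4      /* n: number of buffers in the ring */
#define ITEMS 100    /* how many items the producer generates */

static int ring[SLOTS];
static int in_idx, out_idx;
static sem_t empty_slots;    /* "empty": free buffers, starts at SLOTS */
static sem_t full_slots;     /* "full": filled buffers, starts at 0 */
static long consumed_sum;

static void *producer(void *arg)
{
    (void)arg;
    for (int item = 1; item <= ITEMS; item++) {
        sem_wait(&empty_slots);              /* P(empty) */
        ring[in_idx] = item;
        in_idx = (in_idx + 1) % SLOTS;
        sem_post(&full_slots);               /* V(full) */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        sem_wait(&full_slots);               /* P(full) */
        int item = ring[out_idx];
        out_idx = (out_idx + 1) % SLOTS;
        sem_post(&empty_slots);              /* V(empty) */
        consumed_sum += item;                /* "consume" the item */
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the sum of consumed items. */
long run_ring_buffer(void)
{
    pthread_t p, c;

    in_idx = out_idx = 0;
    consumed_sum = 0;
    sem_init(&empty_slots, 0, SLOTS);
    sem_init(&full_slots, 0, 0);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&empty_slots);
    sem_destroy(&full_slots);
    return consumed_sum;
}
```

With a single producer and a single consumer no extra mutex is needed: in_idx is touched only by the producer and out_idx only by the consumer, and the two semaphores keep them from using the same slot at the same time.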

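16.5.4 Semaphore P and V operations (reposted) --- using semaphores and mutexes together

(3) A group of producers and a group of consumers share n buffers arranged as a ring. Here producers and consumers must synchronize with each other, and in addition the producers must exclude one another, and the consumers must exclude one another, when accessing the buffers.

Define four semaphores:

empty --- the number of empty buffers, initial value n.

full --- the number of full buffers, initial value 0.

mutex1 --- mutual exclusion among producers, initial value 1.

mutex2 --- mutual exclusion among consumers, initial value 1.

The buffers are numbered 0 to n-1. Two indices in and out are used by the producer processes and the consumer processes respectively; each points to the next buffer to use.

Producer process

while(TRUE)

{

produce an item;

P(empty);

P(mutex1);

put the item into buffer(in);

in=(in+1)mod n;

V(mutex1);

V(full);

}

Consumer process

while(TRUE)

{

P(full);

P(mutex2);

take the item out of buffer(out);

out=(out+1)mod n;

V(mutex2);

V(empty);

consume the item;

}

Note that in both the producer and the consumer the order of the two P operations must not be swapped. The P on the synchronization semaphore comes first and the P on the mutual-exclusion semaphore second; otherwise the processes may deadlock.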
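Variant (3) can be sketched in the same style. The assumptions are again spelled out: two producers and two consumers, a fixed number of items per thread instead of endless loops, mutex1 and mutex2 implemented as semaphores with initial value 1 as in the pseudocode, and invented names throughout.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define SLOTS 4
#define WORKERS 2        /* number of producers and of consumers */
#define ITEMS_EACH 50    /* items per producer, and per consumer */

static int ring[SLOTS];
static int in_idx, out_idx;
static sem_t empty_slots, full_slots;   /* synchronization semaphores */
static sem_t mutex1, mutex2;            /* mutual exclusion, initial value 1 */

static void *producer(void *arg)
{
    (void)arg;
    for (int item = 1; item <= ITEMS_EACH; item++) {
        sem_wait(&empty_slots);         /* P(empty) first ... */
        sem_wait(&mutex1);              /* ... then P(mutex1) */
        ring[in_idx] = item;
        in_idx = (in_idx + 1) % SLOTS;
        sem_post(&mutex1);
        sem_post(&full_slots);
    }
    return NULL;
}

static void *consumer(void *arg)
{
    long *sum = arg;                    /* per-consumer running total */
    for (int i = 0; i < ITEMS_EACH; i++) {
        sem_wait(&full_slots);          /* P(full) first ... */
        sem_wait(&mutex2);              /* ... then P(mutex2) */
        int item = ring[out_idx];
        out_idx = (out_idx + 1) % SLOTS;
        sem_post(&mutex2);
        sem_post(&empty_slots);
        *sum += item;
    }
    return NULL;
}

/* Returns the total of all consumed items. */
long run_many_producers_consumers(void)
{
    pthread_t p[WORKERS], c[WORKERS];
    long sums[WORKERS] = { 0 };
    long total = 0;

    in_idx = out_idx = 0;
    sem_init(&empty_slots, 0, SLOTS);
    sem_init(&full_slots, 0, 0);
    sem_init(&mutex1, 0, 1);
    sem_init(&mutex2, 0, 1);
    for (int i = 0; i < WORKERS; i++) {
        pthread_create(&c[i], NULL, consumer, &sums[i]);
        pthread_create(&p[i], NULL, producer, NULL);
    }
    for (int i = 0; i < WORKERS; i++) {
        pthread_join(p[i], NULL);
        pthread_join(c[i], NULL);
        total += sums[i];
    }
    sem_destroy(&empty_slots);
    sem_destroy(&full_slots);
    sem_destroy(&mutex1);
    sem_destroy(&mutex2);
    return total;
}
```

If a producer took mutex1 before empty_slots, it could block on a full buffer while holding the mutex and stall every other producer; taking the synchronization semaphore first avoids holding a lock while waiting for space.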

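[Example 2] An empty plate on the table can hold one piece of fruit. The father may put either an apple or an orange on the plate; the son waits to eat oranges from the plate and the daughter waits to eat apples. Only when the plate is empty may one piece of fruit be put on it for the eater to take. Use P and V primitives to synchronize the three concurrent processes father, son and daughter.

Analysis: father, son and daughter share one plate, which holds one piece of fruit at a time. When the plate is empty the father may put a piece of fruit on it. If it is an orange, the son may eat and the daughter must wait; if it is an apple, the daughter may eat and the son must wait.

This is a variant of the producer-consumer problem in which the producer puts two kinds of items into the buffer and there are two kinds of consumers, each consuming one fixed kind of item.

Solution: use three semaphores S, So and Sa.

S indicates whether the plate is empty; its initial value is 1.

So indicates whether there is an orange on the plate; its initial value is 0.

Sa indicates whether there is an apple on the plate; its initial value is 0.

The synchronization is described as follows:

int S = 1; // the plate semaphore starts at 1: the plate is empty and may be used

int Sa = 0; // no apple at the start

int So = 0; // no orange at the start

main()

{

father(); /* father process */

son(); /* son process */

daughter(); /* daughter process */

}

father() // father process

{

while(1)

{

P(S); // the father requests the plate. It starts empty, so the first request succeeds at once and the plate's count drops to 0. After that, P(S) succeeds only once a child has taken the fruit and released the plate with V(S)

put a piece of fruit on the plate;

if (it is an orange)

V(So); // an orange: add 1 to the orange count, so the son's P(So) succeeds and he can eat it

else

V(Sa); // an apple: add 1 to the apple count

}

}

son()

{

while(1)

{

P(So); // request an orange; once granted, take it and release the plate

take the orange from the plate;

V(S); // release the plate

eat the orange;

}

}

daughter()

{

while(1)

{

P(Sa); // request an apple

take the apple from the plate;

V(S); // release the plate

eat the apple;

}

}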

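Exercise:

Four processes A, B, C and D all need to read a shared file F. The system allows several processes to read F at the same time, with the restriction that A and C must not read F simultaneously, and B and D must not read F simultaneously. P and V operations are used to make the four concurrent processes obey these rules. Answer the following:

(1) The semaphores to define and their initial values: ______.

(2) Fill in the appropriate P and V operations in the program below so that the processes work correctly when run concurrently:

A() { [1]; read F; [2]; }

B() { [3]; read F; [4]; }

C() { [5]; read F; [6]; }

D() { [7]; read F; [8]; }

Answers:

(1) Define two semaphores S1 and S2, both with initial value 1: S1 = 1, S2 = 1. Processes A and C use S1; processes B and D use S2.

(2) [1] to [8] are, in order: P(S1) V(S1) P(S2) V(S2) P(S1) V(S1) P(S2) V(S2).

Semaphores and P/V operations exist to solve synchronization and mutual exclusion between processes.

★ When solving exercises, watch especially for hidden synchronization and mutual exclusion problems. They can usually be reduced to the producer-consumer problem or the readers-writers problem.

★ P and V operations always come in pairs, but that does not mean the pair appears within a single process.

★ For mutual exclusion, the P and V of a pair always appear within one process, and the semaphore's initial value is always greater than 0; how large depends on the situation. For synchronization, a pair of P and V operations is spread over two or more processes.

★ For synchronization, the semaphore's initial value may or may not be 0, and one or several semaphores may be needed.

★ For a semaphore whose initial value is 1, the P operation comes first; with an initial value of 0, some process must execute V before a P on that semaphore can proceed.

★ In the producer-consumer problem, set up three semaphores: empty, the number of free buffers, initial value n; full, the number of filled buffers, initial value 0; mutex, which ensures that only one process writes to the buffers at a time, initial value 1.

★ In the readers-writers problem, set up two semaphores: access, which controls mutual exclusion for writing, initial value 1; and rc, which controls mutually exclusive access to the shared variable ReadCount (the number of readers).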

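16.5.5 Linux semaphore operations --- unnamed semaphores

Unnamed and named semaphores differ in much the same way as ordinary anonymous pipes and named pipes.

A semaphore has the type sem_t. It is tempting to think of it as essentially a long integer // that directly holds the number of available units of the critical resource, as in sem_t value = 1; meaning that one unit is available initially.

However, as the description of sem_init() below shows, a sem_t variable is better treated as an opaque handle for the semaphore, something like an ID, than as the count itself.

Compare thread creation with pthread_create(&thread_id, ...);

The first argument is passed as a pointer so that pthread_create() can store a value through it, namely the thread ID.

Likewise the first argument of sem_init() is a pointer of type sem_t *, through which the semaphore object is set up.

The initial count of available resource units is actually given by the third argument of sem_init().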

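16.5.5.1 sem_init()

sem_init() initializes a semaphore. Its prototype is:

extern int sem_init __P ((sem_t *__sem, int __pshared, unsigned int __value));

sem points to the semaphore object. If pshared is nonzero the semaphore is shared between processes (it must then live in memory shared by those processes); if pshared is 0 it can only be shared by the threads of the process that initialized it. value is the initial value of the semaphore.

Note: sem_init() returns 0 on success (older versions of POSIX did not specify the success value). On failure it returns -1 and sets errno, for example to:

EINVAL --- value exceeds SEM_VALUE_MAX.

ENOSPC --- a resource needed to initialize the semaphore is exhausted, or the limit on the number of semaphores, SEM_NSEMS_MAX, has been reached.

Also, if a child process is created after the semaphore has been created in ordinary (non-shared) memory, the child does not get access to the semaphore: it receives a copy, not the real semaphore.

For example: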

sem_t sem_test;

if(sem_init(&sem_test, 0, 1) == -1)

{

perror("fail to initialize sem_test ");

}

16.5.5.2 sem_destroy()

sem_destroy() destroys a semaphore:

int sem_destroy(sem_t *sem);

An example:

if(sem_destroy(&sem_test) == -1)

{

perror("fail to destroy sem_test");

}

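16.5.5.3 sem_wait() sem_trywait()

The operations below apply to unnamed semaphores and to named semaphores alike.

int sem_wait(sem_t *sem);

int sem_trywait(sem_t *sem);

Both return 0 on success; on failure they return -1 and set errno.

Both functions request one unit of the semaphore's resource and, on success, decrement the count by one. If value is the number of available units, they perform value = value - 1.

sem_wait() requests one unit. If value == 0, no unit is available and sem_wait() blocks until value is at least 1. When, say, value == 1, sem_wait() obtains access, the caller enters the critical section, and value becomes value - 1 == 0, so the next sem_wait() blocks again.

sem_trywait() also requests one unit, but if value == 0 it does not block: it returns -1 immediately with errno set to EAGAIN.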

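16.5.5.4 sem_post()

sem_post() adds one available unit to a semaphore:

int sem_post(sem_t *sem);

Return value: 0 on success, -1 on failure.

For example, if the semaphore sem currently has no available units (value == 0), then after sem_post() it has one more, that is value = value + 1. If threads are blocked in sem_wait() on the semaphore, one of them is woken up.

16.5.5.5 sem_getvalue()

sem_getvalue() reads the current value of a semaphore, the count of available units:

int sem_getvalue(sem_t *sem, int *get_value);

Returns 0 on success; on failure returns -1 and sets errno.

For example, after sem_init(&sem_my, 0, 2); has set the count of sem_my to 2:

int sem_value;

sem_getvalue(&sem_my, &sem_value);

the function stores the value through the second argument, so sem_value is 2.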

16.5.5.6 Example --- synchronizing with sem_wait() and sem_post()

#include <stdio.h>

#include <unistd.h>

#include <stdlib.h>

#include <string.h>

#include <pthread.h>

#include <semaphore.h>

void *thread_functionA(void *arg)

{

sem_t *sem_my = NULL;

sem_my = (sem_t*)arg;

printf("thread_functionA--------------sem_wait\n");

sem_wait(sem_my);

printf("thread_functionA finish \n");

return NULL;

}

void *thread_functionB(void *arg)

{

sem_t *sem_my = NULL;

sem_my = (sem_t*)arg;

printf("thread_functionB--------------sem_post\n");

sem_post(sem_my);

printf("thread_functionB finish\n");

return NULL;

}

int main()

{

sem_t sem_my;

int res;

pthread_t thread_id_A, thread_id_B;

res = sem_init(&sem_my, 0, 0);

if (res != 0)

{

perror("Semaphore initialization failed");

}

printf("sem_init\n");

res = pthread_create(&thread_id_A, NULL, thread_functionA, (void*)&sem_my);

if (res != 0)

{

perror("Thread creation failure");

}

printf("create thread A , and sleep 3s \n");

sleep(3);

printf("create thread B, to call sem_pos() \n");

res = pthread_create(&thread_id_B, NULL, thread_functionB, (void*)&sem_my);

if (res != 0)

{

perror("Thread creation failure");

}

pthread_join(thread_id_A, NULL);

pthread_join(thread_id_B, NULL);

return 0;

}

The output is:

sem_init

create thread A , and sleep 3s

thread_functionA--------------sem_wait

create thread B, to call sem_pos()

thread_functionB--------------sem_post

thread_functionA finish

thread_functionB finish

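This example has a shortcoming: it does not check the return values of all the calls. Whether sem_post() succeeded, for instance, can only be told from its return value.

Also, once the semaphore is no longer needed, it should be destroyed with sem_destroy().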

16.5.5.7 Example --- reading the resource counter with sem_getvalue()

#include <stdio.h>

#include <unistd.h>

#include <stdlib.h>

#include <string.h>

#include <pthread.h>

#include <semaphore.h>

void *thread_functionA(void *arg)

{

int sem_value = 0, ret = 0;

sem_t *sem_my = NULL;

sem_my = (sem_t*)arg;

printf("thread_functionA--------------sem_wait\n");

ret = sem_getvalue(sem_my, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return NULL;

}

printf("before sem_wait() sem_value = %d in threadA \n", sem_value);

sem_wait(sem_my);

printf("thread_functionA finish \n");

ret = sem_getvalue(sem_my, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return NULL;

}

printf("after sem_wait() sem_value = %d in threadA \n", sem_value);

return NULL;

}

void *thread_functionB(void *arg)

{

int sem_value = 0, ret = 0;

sem_t *sem_my = NULL;

sem_my = (sem_t*)arg;

printf("thread_functionB--------------sem_post\n");

ret = sem_getvalue(sem_my, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return NULL;

}

printf("before sem_wait() sem_value = %d in threadB \n", sem_value);

sem_post(sem_my);

printf("thread_functionB finish\n");

ret = sem_getvalue(sem_my, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return NULL;

}

printf("before sem_wait() sem_value = %d in threadB \n", sem_value);

return NULL;

}

int main()

{

sem_t sem_my;

int res;

pthread_t thread_id_A, thread_id_B;

res = sem_init(&sem_my, 0, 2);

if (res != 0)

{

perror("Semaphore initialization failed");

}

printf("sem_init\n");

res = pthread_create(&thread_id_A, NULL, thread_functionA, (void*)&sem_my);

if (res != 0)

{

perror("Thread creation failure");

}

printf("create thread A , and sleep 3s \n");

sleep(3);

res = pthread_create(&thread_id_B, NULL, thread_functionB, (void*)&sem_my);

if (res != 0)

{

perror("Thread creation failure");

}

pthread_join(thread_id_A, NULL);

pthread_join(thread_id_B, NULL);

return 0;

}

The output is:

sem_init

create thread A , and sleep 3s

thread_functionA--------------sem_wait

before sem_wait() sem_value = 2 in threadA

thread_functionA finish

after sem_wait() sem_value = 1 in threadA

thread_functionB--------------sem_post

before sem_wait() sem_value = 1 in threadB

thread_functionB finish

before sem_wait() sem_value = 2 in threadB

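As before, this example does not check the return values of all the calls; whether sem_post() succeeded can only be told from its return value. Note also that both messages printed by thread B are labeled "before sem_wait()" although they bracket the call to sem_post(): the first shows the value before the post and the second the value after it.

Once the semaphore is no longer needed, it should be destroyed with sem_destroy().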

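16.5.6 Linux semaphore operations --- named semaphores

Unnamed and named semaphores relate to each other like anonymous and named pipes.

That is:

1 An anonymous pipe is used for communication between a parent process and its children.

2 A named pipe can be used between processes that have no parent-child relationship, that is between independent processes.

3 An unnamed semaphore is typically shared by the threads of one process. (It can be shared between processes only if it is placed in shared memory.)

4 A named semaphore can be used between different processes; like a named pipe, it works between independent processes.

Named semaphores synchronize processes that do not share memory. Like a file, a named semaphore has a name, a user ID, a group ID and permissions.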

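16.5.6.1 sem_open()

sem_open() creates or opens a named semaphore:

sem_t *sem_open(const char *name, int oflag);

sem_t *sem_open(const char *name, int oflag, mode_t mode, unsigned int value);

Parameters:

name external name of the semaphore

oflag whether to create a new semaphore or open an existing one

mode permission bits

value initial value of the semaphore

Return value:

A pointer to the semaphore on success, SEM_FAILED on error.

oflag can be 0, O_CREAT (create the semaphore if it does not exist yet) or O_CREAT|O_EXCL (create the semaphore, and fail if it already exists). If O_CREAT is given, the third and fourth arguments are required: mode gives the permission bits and value gives the initial value, which usually is the number of shared resource units. The initial value must not exceed SEM_VALUE_MAX, a constant that is required to be at least 32767. A binary semaphore usually starts at 1, while a counting semaphore often starts at a value greater than 1.

If O_CREAT is given without O_EXCL, the semaphore is initialized only if it does not exist yet. Specifying O_CREAT for a semaphore that already exists is not an error. (So the program that creates the semaphore can use O_CREAT | O_EXCL to insist on creating it, and later opens can use just O_CREAT, which does not create anything if the semaphore exists, or simply 0.) The flag just means "create and initialize the semaphore if it does not exist yet". Specifying O_CREAT|O_EXCL for a semaphore that already exists, however, is an error.

sem_open() returns a pointer to a sem_t semaphore, which keeps track of the current number of shared resource units.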

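16.5.6.2 sem_close()

Closes a semaphore.

Prototype:

int sem_close(sem_t *sem);

Parameters:

sem pointer to the semaphore

Return value:

0 on success, -1 otherwise.

When a process terminates, the kernel automatically performs this close operation on every named semaphore the process still has open, whether the process terminated voluntarily or not.

Note that closing a semaphore does not remove it from the system. POSIX named semaphores have at least kernel persistence: the value is retained even when no process currently has the semaphore open.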

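16.5.6.3 sem_unlink()

Removes a semaphore from the system.

Prototype:

int sem_unlink(const char *name);

Parameters:

name external name of the semaphore

Return value:

0 on success, -1 otherwise.

A named semaphore is removed from the system with sem_unlink().

Every semaphore has a reference count of how many times it is currently open. sem_unlink() removes the name right away, but the semaphore itself is destroyed only when that count drops to 0, that is after the last sem_close(). (The author's own test found that a semaphore can be unlinked while its value is nonzero; the semaphore's value is unrelated to this reference count.)

After sem_unlink(), a sem_open() with the same name refers to a new semaphore even if other processes still have the old one open. sem_unlink() always returns immediately, even while other processes have the semaphore open.

Note: like a file, a named semaphore persists in the system. If a process creates one and then exits, the semaphore continues to exist, just as a file would, until sem_unlink() removes it.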
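The life cycle described in the last three sections, sem_open(), use, sem_close(), sem_unlink(), can be exercised inside a single process. The sketch below is illustrative only: the name "/demo_sem_<pid>" and the function named_semaphore_roundtrip() are made up, and the leading slash follows the naming form that POSIX recommends for portability.

```c
#include <assert.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <unistd.h>

/* Creates a named semaphore, opens it a second time by name, posts through
   one handle and waits through the other. Returns 1 if both handles saw the
   same counter, 0 if not, -1 on error. */
int named_semaphore_roundtrip(void)
{
    char name[64];
    sem_t *creator, *opener;
    int after_post = -1, after_wait = -1;

    snprintf(name, sizeof name, "/demo_sem_%ld", (long)getpid());
    sem_unlink(name);                       /* remove a leftover, if any */

    creator = sem_open(name, O_CREAT | O_EXCL, 0644, 0);   /* initial value 0 */
    if (creator == SEM_FAILED)
        return -1;
    opener = sem_open(name, 0);             /* open the existing semaphore */
    if (opener == SEM_FAILED) {
        sem_close(creator);
        sem_unlink(name);
        return -1;
    }

    sem_post(creator);                      /* value: 0 -> 1 */
    sem_getvalue(opener, &after_post);
    sem_wait(opener);                       /* value: 1 -> 0, does not block */
    sem_getvalue(creator, &after_wait);

    sem_close(opener);
    sem_close(creator);
    sem_unlink(name);                       /* remove the name from the system */
    return after_post == 1 && after_wait == 0;
}
```

In real use the two handles would belong to two different processes that agree on the name, as in the examples that follow.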

16.5.6.4 Example --- creating the semaphore

#include <stdio.h>

#include <unistd.h>

#include <stdlib.h>

#include <string.h>

#include <pthread.h>

#include <semaphore.h>

#include <fcntl.h>

#include <sys/stat.h>

int main()

{

sem_t *sem_ret = NULL;

int sem_value = 0, ret = 0;

char *sem_name = "wkf";

sem_ret = sem_open(sem_name, O_CREAT|O_EXCL, 0644, 0); /* create the semaphore with initial value 0 */

if(SEM_FAILED == sem_ret)

{

perror("fail to use sem_open() ");

return 1;

}

ret = sem_getvalue(sem_ret, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return 1;

}

printf("create the sem, value = %d \n", sem_value);

return 0;

}

16.5.6.5 Example --- waiting on the semaphore

#include <stdio.h>

#include <fcntl.h>

#include <semaphore.h>

int main()

{

sem_t *sem_ret = NULL;

int sem_value = 0, ret = 0;

char *sem_name = "wkf";

//sem_ret = sem_open(sem_name, O_CREAT|O_EXCL, 0644, 1);

//sem_ret = sem_open(sem_name, O_CREAT, 0644, 1);

sem_ret = sem_open(sem_name, 0, 0644, 1);

if(SEM_FAILED == sem_ret)

{

perror("fail to use sem_open() ");

return 1;

}

ret = sem_getvalue(sem_ret, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return 1;

}

printf("process B before sem_wait, value = %d \n", sem_value);

printf("process B call sem_wait() \n");

ret = sem_wait(sem_ret);

if(-1 == ret)

{

perror("fail use sem_wait()");

return 1;

}

printf("process B waitup \n");

ret = sem_getvalue(sem_ret, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return 1;

}

printf("process B after sem_wait, value = %d \n", sem_value);

ret = sem_close(sem_ret);

if(-1 == ret)

{

perror("fail to use sem_close() ");

return 1;

}

ret = sem_unlink(sem_name);

if(-1 == ret)

{

perror("fail to use sem_unlink() ");

return 1;

}

return 0;

}

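16.5.6.6 Example --- posting the semaphore

#include <stdio.h>

#include <unistd.h>

#include <fcntl.h>

#include <semaphore.h>

int main()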

{

sem_t *sem_ret = NULL;

int sem_value = 0, ret = 0;

char *sem_name = "wkf";

//sem_ret = sem_open(sem_name, O_CREAT|O_EXCL, 0644, 1);

//sem_ret = sem_open(sem_name, O_CREAT, 0644, 1);

sem_ret = sem_open(sem_name, 0, 0644, 1);

if(SEM_FAILED == sem_ret)

{

perror("fail to use sem_open() ");

return 1;

}

ret = sem_getvalue(sem_ret, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return 1;

}

ret = sem_getvalue(sem_ret, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return 1;

}

printf("process A before sleep, value = %d \n", sem_value);

printf("process A sleep(3) \n");

sleep(3);

ret = sem_post(sem_ret);

if(-1 == ret)

{

perror("fail use sem_post()");

return 1;

}

printf("process A call sem_pos() \n");

ret = sem_getvalue(sem_ret, &sem_value);

if(-1 == ret)

{

perror("sem_getvalue() error: ");

return 1;

}

printf("process A after sleep, value = %d \n", sem_value);

return 0;

}

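16.5.6.7 Example --- test results

The tests below use the three example programs above.

Assume:

1 The program that creates the semaphore is sem_init_test.c, compiled to the executable sem_init_test.

2 The program that waits on the semaphore is semB.c, compiled to the executable semB.

3 The program that posts the semaphore is semA.c, compiled to the executable semA.

Running the creating program first prints:

create the sem, value = 0

Then running ./semB to wait on the semaphore prints:

process B before sem_wait, value = 0

process B call sem_wait() // the semaphore value is 0, so the process blocks here

Then running ./semA to post the semaphore prints:

process A before sleep, value = 0

process A sleep(3)

process A call sem_pos()

process A after sleep, value = 1 // the process has posted the semaphore, so its value is 1

At this point semB continues and prints:

process B waitup

process B after sem_wait, value = 0 // semB sees the semaphore value become 1, returns from sem_wait() and exits; value = value - 1 == 0, so the semaphore value is now 0.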

Reposted from: https://mylinux.blog.csdn.net/article/details/8995999
