1521 Lec 11 - Concurrency, Parallelism, Threads
https://www.cse.unsw.edu.au/~cs1521/24T1/
• Concurrency vs Parallelism
• Flynn’s taxonomy
• Threads in C
• Atomics
Concurrency:
multiple computations in overlapping time periods …
does not have to be simultaneous
Parallelism:
multiple computations executing simultaneously
The web server providing the class website uses process-level parallelism
#include <pthread.h>
int pthread_create (
pthread_t *thread,
const pthread_attr_t *attr,
void *(*thread_main)(void *),
void *arg);
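• pthread_create starts a new thread running thread_main(arg); it is analogous to posix_spawn(3).
• The thread has the attributes specified in attr (NULL if you want no special attributes).
• pthread_join waits for a thread to finish; it is analogous to waitpid(3).
• pthread_exit terminates the execution of the current thread (and frees its resources); it is analogous to exit(3).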
#include <pthread.h>
#include <stdio.h>
// This function is called to start thread execution.
// It can be given any pointer as an argument.
void *run_thread(void *argument) {
int *p = argument;
for (int i = 0; i < 10; i++) {
printf("Hello this is thread #%d: i=%d\n", *p, i);
}
// A thread finishes when either the thread's start function
// returns, or the thread calls `pthread_exit(3)'.
// A thread can return a pointer of any type --- that pointer
// can be fetched via `pthread_join(3)'
return NULL;
}
source code for two_threads.c
int main(void) {
// Create two threads running the same task, but different inputs.
pthread_t thread_id1;
int thread_number1 = 1;
pthread_create(&thread_id1, NULL, run_thread, &thread_number1);
pthread_t thread_id2;
int thread_number2 = 2;
pthread_create(&thread_id2, NULL, run_thread, &thread_number2);
// Wait for the 2 threads to finish.
pthread_join(thread_id1, NULL);
pthread_join(thread_id2, NULL);
return 0;
}
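The comments in run_thread note that a thread can return a pointer which pthread_join then fetches. A minimal sketch of that mechanism follows. The names square_thread and square_in_thread are hypothetical and not part of the lecture code; the result is heap-allocated on the assumption that it must outlive the thread that produced it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

// Hypothetical thread function: squares the int it is given and
// returns the result in heap memory, so the result outlives the thread.
void *square_thread(void *argument) {
    int n = *(int *)argument;
    int *result = malloc(sizeof *result);
    if (result != NULL) {
        *result = n * n;
    }
    return result;  // returning here has the same effect as pthread_exit(result)
}

// Hypothetical helper: runs square_thread in a new thread and waits for it.
// Returns 0 on success, -1 on failure.
int square_in_thread(int n, int *out) {
    assert(out != NULL);
    pthread_t thread_id;
    // &n stays valid because this function does not return before the join
    if (pthread_create(&thread_id, NULL, square_thread, &n) != 0) {
        return -1;
    }
    void *returned = NULL;
    // pthread_join stores the thread's return value in `returned`
    if (pthread_join(thread_id, &returned) != 0 || returned == NULL) {
        return -1;
    }
    *out = *(int *)returned;
    free(returned);
    return 0;
}
```

Calling square_in_thread(7, &out) from main should leave 49 in out. Passing NULL as the second argument of pthread_join, as two_threads.c does, simply discards the return value.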
struct job {
long start, finish;
double sum;
};
void *run_thread(void *argument) {
struct job *j = argument;
long start = j->start;
long finish = j->finish;
double sum = 0;
for (long i = start; i < finish; i++) {
sum += i;
}
j->sum = sum;
return NULL;
}
source code for thread_sum.c
double overall_sum = 0;
for (int i = 0; i < n_threads; i++) {
pthread_join(thread_id[i], NULL);
overall_sum += jobs[i].sum;
}
printf("\nCombined sum of integers 0 to %lu is %.0f\n", integers_to_sum,
overall_sum);
return 0;
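The fragments above show the worker function and the join loop, but not how the range is split into jobs. One possible way to do it is sketched below. The function sum_with_threads, the MAX_THREADS cap, and the even split with the remainder given to the last thread are all assumptions for illustration, not the lecture's thread_sum.c.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 64  // assumption: an arbitrary cap for this sketch

struct job {
    long start, finish;
    double sum;
};

// Sums the integers in [start, finish) for one job.
static void *sum_range(void *argument) {
    struct job *j = argument;
    assert(j->start <= j->finish);
    double sum = 0;
    for (long i = j->start; i < j->finish; i++) {
        sum += i;
    }
    j->sum = sum;
    return NULL;
}

// Hypothetical driver: sums 0 .. integers_to_sum - 1 using n_threads threads.
// Returns -1 if the arguments are invalid or a thread cannot be created.
double sum_with_threads(long integers_to_sum, int n_threads) {
    if (n_threads < 1 || n_threads > MAX_THREADS || integers_to_sum < 0) {
        return -1;
    }
    pthread_t thread_id[MAX_THREADS];
    struct job jobs[MAX_THREADS];
    long per_thread = integers_to_sum / n_threads;
    int created = 0;
    for (int i = 0; i < n_threads; i++) {
        // each thread gets its own job struct, so no data is shared
        jobs[i].start = i * per_thread;
        // the last thread also takes any remainder
        jobs[i].finish = (i == n_threads - 1) ? integers_to_sum
                                              : (i + 1) * per_thread;
        jobs[i].sum = 0;
        if (pthread_create(&thread_id[i], NULL, sum_range, &jobs[i]) != 0) {
            break;
        }
        created++;
    }
    double overall_sum = 0;
    for (int i = 0; i < created; i++) {
        pthread_join(thread_id[i], NULL);
        overall_sum += jobs[i].sum;
    }
    return created == n_threads ? overall_sum : -1;
}
```

Because each thread writes only to its own struct job, and main reads a job's sum only after joining that thread, this sketch needs no locking.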
Seconds to sum the first 1e+10 (10,000,000,000) integers using double arithmetic,
with 𝑁 threads, on some different machines…
host 1 2 4 12 24 50 500
int main(void) {
pthread_t thread_id1;
int thread_number = 1;
pthread_create(&thread_id1, NULL, run_thread, &thread_number);
thread_number = 2;
pthread_t thread_id2;
pthread_create(&thread_id2, NULL, run_thread, &thread_number);
pthread_join(thread_id1, NULL);
pthread_join(thread_id2, NULL);
return 0;
}
source code for two_threads_broken.c
• variable thread_number will probably change in main, before thread 1 starts executing…
• ⟹ thread 1 will probably print Hello this is thread 2 … ?!
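One way to avoid this problem is to give every thread its own argument that nothing else modifies while the thread runs. The sketch below assumes hypothetical names (thread_arg, record_number, run_two_threads) and records what each thread saw in an array instead of printing it, so the result can be checked.

```c
#include <assert.h>
#include <pthread.h>

#define N_THREADS 2

// What each thread observed as its number (one slot per thread, so no sharing).
int seen[N_THREADS];

struct thread_arg {
    int thread_number;  // value private to one thread
    int slot;           // index into `seen`
};

// Hypothetical thread function: records the number it was given.
void *record_number(void *argument) {
    struct thread_arg *arg = argument;
    assert(arg->slot >= 0 && arg->slot < N_THREADS);
    seen[arg->slot] = arg->thread_number;
    return NULL;
}

// Starts N_THREADS threads, each with its own element of `args`.
// Returns 0 if all threads were created and joined, -1 otherwise.
int run_two_threads(void) {
    pthread_t thread_id[N_THREADS];
    struct thread_arg args[N_THREADS];
    int created = 0;
    for (int i = 0; i < N_THREADS; i++) {
        // args[i] is never modified after the thread is created
        args[i].thread_number = i + 1;
        args[i].slot = i;
        if (pthread_create(&thread_id[i], NULL, record_number, &args[i]) != 0) {
            break;
        }
        created++;
    }
    // args must stay alive until every created thread has been joined
    for (int i = 0; i < created; i++) {
        pthread_join(thread_id[i], NULL);
    }
    return created == N_THREADS ? 0 : -1;
}
```

This is the same idea as thread_number1 and thread_number2 in two_threads.c, generalised to an array.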
int bank_account = 0;
// add $1 to Andrew's bank account 100,000 times
void *add_100000(void *argument) {
for (int i = 0; i < 100000; i++) {
// execution may switch threads in the middle of the increment,
// between the load of the variable's value
// and the store of its new value;
// changes the other thread makes in between will be lost
nanosleep(&(struct timespec){ .tv_nsec = 1 }, NULL);
// RECALL: shorthand for `bank_account = bank_account + 1`
bank_account++;
}
return NULL;
}
source code for bank_account_broken.c
int main(void) {
// create two threads performing the same task
pthread_t thread_id1;
pthread_create(&thread_id1, NULL, add_100000, NULL);
pthread_t thread_id2;
pthread_create(&thread_id2, NULL, add_100000, NULL);
// wait for the 2 threads to finish
pthread_join(thread_id1, NULL);
pthread_join(thread_id2, NULL);
// will probably be much less than $200000
printf("Andrew's bank account has $%d\n", bank_account);
return 0;
}
We don’t want two threads in the critical section at the same time; we must establish mutual exclusion.
For example:
pthread_mutex_lock (&bank_account_lock);
andrews_bank_account += 1000000;
pthread_mutex_unlock (&bank_account_lock);
int bank_account = 0;
pthread_mutex_t bank_account_lock = PTHREAD_MUTEX_INITIALIZER;
// add $1 to Andrew's bank account 100,000 times
void *add_100000(void *argument) {
for (int i = 0; i < 100000; i++) {
pthread_mutex_lock(&bank_account_lock);
// only one thread can execute this section of code at any time
bank_account = bank_account + 1;
pthread_mutex_unlock(&bank_account_lock);
}
return NULL;
}
source code for bank_account_mutex.c
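// thread 1: locks Andrew's account, then Xavier's
pthread_mutex_lock(&andrews_bank_account_lock);
pthread_mutex_lock(&xaviers_bank_account_lock);
// thread 2: locks the same mutexes in the opposite order, so each thread
// can end up holding one lock and waiting forever for the other (deadlock)
pthread_mutex_lock(&xaviers_bank_account_lock);
pthread_mutex_lock(&andrews_bank_account_lock);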
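A common way to avoid this kind of deadlock is to make every thread acquire the locks in the same order. The sketch below is one possible illustration, not the lecture's code: the struct account type, the transfer function, and the choice to order locks by address are all assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t lock;
    long balance;
};

struct account andrew = { .lock = PTHREAD_MUTEX_INITIALIZER, .balance = 1000 };
struct account xavier = { .lock = PTHREAD_MUTEX_INITIALIZER, .balance = 1000 };

// Hypothetical transfer that always locks the lower-addressed account first,
// so two opposite transfers cannot each hold one lock and wait for the other.
void transfer(struct account *from, struct account *to, long amount) {
    assert(from != NULL && to != NULL);
    if (from == to) {
        return;
    }
    struct account *first = (uintptr_t)from < (uintptr_t)to ? from : to;
    struct account *second = first == from ? to : from;
    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    from->balance -= amount;
    to->balance += amount;
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
}

void *andrew_pays_xavier(void *argument) {
    (void)argument;
    for (int i = 0; i < 100000; i++) {
        transfer(&andrew, &xavier, 1);
    }
    return NULL;
}

void *xavier_pays_andrew(void *argument) {
    (void)argument;
    for (int i = 0; i < 100000; i++) {
        transfer(&xavier, &andrew, 1);
    }
    return NULL;
}

// Runs both directions concurrently and returns the combined balance.
long run_transfers(void) {
    pthread_t t1, t2;
    int r1 = pthread_create(&t1, NULL, andrew_pays_xavier, NULL);
    int r2 = pthread_create(&t2, NULL, xavier_pays_andrew, NULL);
    if (r1 == 0) {
        pthread_join(t1, NULL);
    }
    if (r2 == 0) {
        pthread_join(t2, NULL);
    }
    return andrew.balance + xavier.balance;
}
```

Money only moves between the two accounts, so the combined balance should be unchanged after run_transfers returns. With opposite lock orders, the same two threads could block forever.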
fetch_add: n += value
fetch_sub: n -= value
fetch_or: n |= value
fetch_xor: n ^= value
compare_exchange:
if (n == v1) {
n = v2;
}
return n;
• With mutexes, a program can lock mutex A, and then (before unlocking A) lock some mutex B; if another thread locks them in the opposite order, the two threads can deadlock.
• Atomic instructions are (by definition!) atomic, so there’s no equivalent to the above problem.
• Goodbye deadlocks!
• Non-blocking: If a thread fails or is suspended, it cannot cause failure or suspension of another thread.
#include <stdatomic.h>
atomic_int bank_account = 0;
// add $1 to Andrew's bank account 100,000 times
void *add_100000(void *argument) {
for (int i = 0; i < 100000; i++) {
// NOTE: This *cannot* be `bank_account = bank_account + 1`,
// as that will not be atomic!
// However, `bank_account++` would be okay
// and, `atomic_fetch_add(&bank_account, 1)` would also be okay
bank_account += 1;
}
return NULL;
}
source code for bank_account_atomic.c
• Although faster and simpler than traditional locking, atomics still carry a performance penalty and increase program complexity.
• It can be incredibly tricky to write correct code at this low level (e.g. memory ordering, which we won’t cover in COMP1521).
• When sharing data with a thread, we can only pass the address of our data.
• what if by the time the thread reads the data, that data no longer exists?
• but what if we pass data with a lifetime shorter than the thread lifetime?
• it changes the stack memory which used to hold super_special_number (by using it for greeting)
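One possible way around the lifetime problem is to give the thread a heap-allocated copy of the data, which the thread frees when it has finished with it. The sketch below assumes hypothetical names (start_thread_with_copy, read_number, observed_number) and is only one of several options; another is to keep the data in a variable that outlives the thread, such as a global.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

int observed_number;  // written by the thread so the caller can check it

// Hypothetical thread function: it owns its argument and frees it when done.
void *read_number(void *argument) {
    int *number = argument;
    observed_number = *number;
    free(number);
    return NULL;
}

// Hypothetical helper: starts a thread with a heap-allocated copy of `value`.
// The copy stays valid after this function returns, unlike a local variable
// whose stack memory may be reused by later function calls.
// Returns 0 on success, -1 on failure.
int start_thread_with_copy(pthread_t *thread_id, int value) {
    assert(thread_id != NULL);
    int *copy = malloc(sizeof *copy);
    if (copy == NULL) {
        return -1;
    }
    *copy = value;
    if (pthread_create(thread_id, NULL, read_number, copy) != 0) {
        free(copy);  // the thread never started, so we still own the copy
        return -1;
    }
    return 0;
}
```

The caller can return from start_thread_with_copy immediately and join the thread later; the thread reads its own copy regardless of what has happened to the caller's stack in the meantime.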
• Other fun concurrency problems/concepts: livelock, starvation, thundering herd, memory ordering, semaphores,
software transactional memory, user threads, fibers, etc.