What is the difference between Go's multithreading approach and other approaches, such as pthread, boost::thread or Java Threads?
Quoted from the Day 3 Tutorial (read it for more information):
Goroutines are multiplexed as needed onto system threads. When a goroutine executes a blocking system call, no other goroutine is blocked.
We will do the same for CPU-bound goroutines at some point, but for now, if you want user-level parallelism you must set $GOMAXPROCS or call runtime.GOMAXPROCS(n).
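As a rough illustration of the runtime.GOMAXPROCS(n) call mentioned in the quote, here is a minimal sketch. The helper name setParallelism is made up for this example; the only real API used is runtime.GOMAXPROCS, which returns the previous setting and leaves the value unchanged when given an argument below 1.

```go
package main

import (
	"fmt"
	"runtime"
)

// setParallelism (a hypothetical helper) sets the maximum number of CPUs
// that may execute Go code simultaneously and returns the previous setting.
// Passing a value below 1 only queries the current setting.
func setParallelism(n int) int {
	return runtime.GOMAXPROCS(n)
}

func main() {
	prev := setParallelism(2)
	// The second call passes 0, so it reports the current value unchanged.
	fmt.Println("previous:", prev, "current:", setParallelism(0))
}
```

The same limit can be set from outside the program through the GOMAXPROCS environment variable, which is what the quote refers to as $GOMAXPROCS.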
A goroutine does not necessarily correspond to an OS thread. It can start with a smaller initial stack, and the stack will grow as needed.
Multiple goroutines may be multiplexed onto a single thread when needed.
More importantly, the concept is the one outlined above: a goroutine is a sequential program that may itself block but does not block other goroutines.
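To make "may block itself but does not block other goroutines" concrete, here is a small sketch. The function runWhileBlocked and the channel names are invented for the example. One goroutine waits on a channel that nobody has sent to yet, while a second goroutine finishes its work in the meantime.

```go
package main

import "fmt"

// runWhileBlocked (a hypothetical helper) starts one goroutine that cannot
// proceed until it receives on gate, and a second goroutine that computes a
// sum. It returns the sum and the value the waiting goroutine finally got.
func runWhileBlocked() (int, string) {
	gate := make(chan string) // unbuffered: a receive waits for a matching send
	got := make(chan string)
	result := make(chan int)

	// This goroutine cannot get past the receive until someone sends on gate.
	go func() { got <- <-gate }()

	// This goroutine is unaffected by the waiting one and runs to completion.
	go func() {
		total := 0
		for i := 1; i <= 100; i++ {
			total += i
		}
		result <- total
	}()

	sum := <-result // the worker finished although the waiter is still stuck
	gate <- "go"    // only now release the waiting goroutine
	return sum, <-got
}

func main() {
	sum, received := runWhileBlocked()
	fmt.Println(sum, received) // prints "5050 go"
}
```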
In gccgo, goroutines are implemented as pthreads, so a goroutine can be identical to an OS thread, too. The point is to separate the concept of an OS thread from the way we think about multithreading when programming.
In the reference compilers (5g/6g/8g), the master scheduler (src/pkg/runtime/proc.c) creates N OS threads, where N is controlled by runtime.GOMAXPROCS(n) (default 1). Each scheduler thread pulls a new goroutine off the master list and starts running it. The goroutine(s) will continue to run until a syscall is made (e.g. printf) or an operation on a channel is made, at which point the scheduler will grab the next goroutine and run it from the point at which it left off (see gosched() calls in src/pkg/runtime/chan.c).
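The hand-off at channel operations described above can be sketched roughly as follows. This is an illustration, not the runtime's actual code: pingPong is a made-up function, and it pins GOMAXPROCS to 1 explicitly (newer Go releases default to the number of CPUs) so that the worker can only run when the caller blocks on a channel.

```go
package main

import (
	"fmt"
	"runtime"
)

// pingPong (a hypothetical helper) hands control back and forth between the
// calling goroutine and a worker goroutine over unbuffered channels, and
// returns the order in which the steps happened.
func pingPong(rounds int) []string {
	// Limit Go code to one thread for this call, then restore the old value.
	defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1))

	ping := make(chan int)
	pong := make(chan string)

	go func() {
		for n := range ping {
			pong <- fmt.Sprintf("worker got %d", n)
		}
	}()

	var order []string
	for i := 0; i < rounds; i++ {
		order = append(order, fmt.Sprintf("caller sends %d", i))
		ping <- i                     // caller blocks here, so the worker gets to run
		order = append(order, <-pong) // worker's send completes once we receive
	}
	close(ping) // lets the worker's range loop end
	return order
}

func main() {
	for _, step := range pingPong(2) {
		fmt.Println(step)
	}
}
```

With two rounds this prints "caller sends 0", "worker got 0", "caller sends 1", "worker got 1", one per line. The order is fixed by the channel synchronization itself, not by timing.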
The scheduling, for all intents and purposes, is implemented with coroutines. The same functionality could be written in straight C using setjmp() and longjmp(); Go (and other languages that implement lightweight/green threads) just automates the process for you.
The upside to lightweight threads is that, since it's all in userspace, creating a "thread" is very cheap (it only allocates a small default stack), and it can be very efficient due to the inherent structure of how the threads talk to each other. The downside is that they are not true threads, which means a single lightweight thread (for example, one in a CPU-bound loop that never reaches a scheduling point) can block the entire program, even when it appears that all the threads should be running concurrently.
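To give a feel for how cheap creating a "thread" is, here is a minimal sketch that starts ten thousand goroutines and waits for all of them. The function name spawn and the count are arbitrary choices for the example; it makes no claim about exact memory or timing costs.

```go
package main

import (
	"fmt"
	"sync"
)

// spawn (a hypothetical helper) starts n goroutines, each of which writes to
// its own slot of a slice, and returns how many of them ran.
func spawn(n int) int {
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = 1 // each goroutine touches a distinct element
		}(i)
	}
	wg.Wait() // block until every goroutine has called Done

	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(spawn(10000)) // prints 10000
}
```

Doing the same with one OS thread per task would typically reserve a much larger stack for each of them up front.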