
Redis IO model

Our applications are deployed on a Linux system, and Linux itself is also software: an operating system running on top of the computer hardware. When a network transmission arrives, the network card writes the received byte stream into a buffer in Linux kernel memory; the user-space application then calls an interface exposed by Linux to read the data from that kernel buffer into user space. This read operation is one IO. Writing works the same way in the opposite direction.

Different operating systems have different IO models. Below are some IO models of the Linux system.

Linux divides memory into user space and kernel space. This is done to protect the operating system kernel and prevent external applications, or users, from operating on the kernel directly.

The state of a thread running in user space is called user state.

The state of a thread running in kernel space is called kernel state.

Performance bottlenecks of IO:

A. Switching between user mode and kernel mode (data copies)

B. Blocking waits by the reading and writing threads

Linux's IO models are optimized around these two points.

Blocking IO

When the user application thread calls the Linux recvfrom function to read data and there is no data in the kernel buffer, the user thread blocks and waits until data arrives in the kernel buffer; the kernel then copies the data from its buffer into the application's user-space memory.
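A minimal sketch of this model, not Redis code: it assumes an illustrative UDP socket bound to port 8000, and the thread simply sits inside recvfrom() until the kernel buffer has data.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    /* Illustrative setup: a UDP socket bound to port 8000. */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8000);
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    char buf[1024];
    /* Blocking IO: the calling thread sleeps inside recvfrom()
       until the kernel buffer has data, then the data is copied
       from kernel space into buf. */
    ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
    if (n > 0)
        printf("received %zd bytes\n", n);

    close(fd);
    return 0;
}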

It is like queuing to buy steamed buns: there are no buns yet, but you do not know that, so you can only stand there waiting and doing nothing. Once the buns are steamed, you take them and put them in your bag.

Problem:

While the thread is blocked on one call, everything queued behind it is blocked as well; even if the data for later reads and writes is already ready, it cannot be read or written.

Non-blocking IO

When the user application thread calls the Linux recvfrom function to read data and there is no data in the kernel buffer, the call returns immediately with a "no data" result instead of waiting. The user thread then calls recvfrom again and again until the kernel buffer has data, at which point the kernel copies the data from its buffer into the application's user-space memory.
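A sketch of the non-blocking variant, assuming the socket was created as in the previous sketch (the helper name read_nonblocking is made up): O_NONBLOCK makes recvfrom() return immediately with EAGAIN/EWOULDBLOCK when there is no data, so the thread has to keep retrying.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/socket.h>

/* fd is an already-created socket (see the previous sketch). */
void read_nonblocking(int fd) {
    /* Switch the FD to non-blocking mode. */
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    char buf[1024];
    for (;;) {
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
        if (n >= 0) {
            printf("received %zd bytes\n", n);
            break;
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* No data yet: the call returned immediately, so the
               thread busy-loops and asks again, burning CPU. */
            continue;
        }
        perror("recvfrom");
        break;
    }
}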

It is like queuing to buy steamed buns, except the shopkeeper tells you straight away that there are none. You know the result immediately, so you keep asking the shopkeeper over and over whether the buns are ready, and only when he finally says yes can you take them home.

Problem:

If there is never any data, the thread will call recvfrom endlessly, keeping the CPU busy with useless calls and wasting CPU resources.

IO multiplexing

The kernel uses file descriptors (FD) to access files. A file descriptor is a non-negative integer: when an existing file is opened or a new file is created, the kernel returns a file descriptor, and reading or writing a file also requires a file descriptor to specify which file to operate on. "Files" here include audio files, regular files, hardware devices and so on, including network sockets.
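A small sketch of what a file descriptor looks like in practice (the file path is just an illustrative example): open() returns a small non-negative integer, and that same integer is what read() and close() use to identify the file.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* The kernel returns a non-negative integer FD, or -1 on error. */
    int fd = open("/etc/hostname", O_RDONLY);   /* illustrative example file */
    if (fd < 0) { perror("open"); return 1; }
    printf("file descriptor: %d\n", fd);        /* often 3, since 0/1/2 are taken */

    char buf[128];
    ssize_t n = read(fd, buf, sizeof(buf));     /* the FD tells the kernel which file */
    printf("read %zd bytes\n", n);

    close(fd);
    return 0;
}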

IO multiplexing means using a single thread to monitor multiple file descriptors (FDs) and receiving a notification when an FD becomes readable or writable. This avoids pointless waiting and makes full use of CPU resources.

Select mode

The user application thread calls the select function to monitor multiple FDs. The FD set is copied from user space into kernel space; if no FD has data, the thread still has to wait. When some FD becomes ready, meaning it has data, the kernel marks the ready FDs and copies the FD set back to user memory. The user thread then traverses the FD set to find the FDs that can be read and reads their data; on the next call the FD set is copied into kernel memory again.
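A minimal select() sketch, assuming the watched sockets are created elsewhere (the helper name wait_with_select is made up): it shows the FD set being handed to the kernel and then re-scanned in user space to find the ready descriptors.

#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>

/* fds is an array of nfds sockets to watch (created elsewhere). */
void wait_with_select(int *fds, int nfds) {
    fd_set readfds;
    FD_ZERO(&readfds);
    int maxfd = -1;
    for (int i = 0; i < nfds; i++) {        /* build the FD set in user space */
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd) maxfd = fds[i];
    }

    /* The set is copied into the kernel; the call blocks until at
       least one FD is ready, then the kernel marks the ready FDs
       and the set is copied back. */
    int ready = select(maxfd + 1, &readfds, NULL, NULL, NULL);
    if (ready <= 0) return;

    char buf[1024];
    for (int i = 0; i < nfds; i++) {        /* user space must scan the whole set */
        if (FD_ISSET(fds[i], &readfds)) {
            ssize_t n = recv(fds[i], buf, sizeof(buf), 0);
            printf("fd %d: read %zd bytes\n", fds[i], n);
        }
    }
}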

It is like queuing for food in a restaurant where a waiter watches the kitchen through a monitor. You only need to ask the waiter whether anything is ready; if not, you still have to wait, but as soon as something is ready the waiter sees it on the monitor and lets you order.

Poll mode

The principle of poll mode is similar to that of select mode. The difference is that poll adds an events field at the bottom layer, which distinguishes read events, write events, exception events and so on.

Process (see the poll() sketch after this list):

A. First, register the events to be monitored, which can be read events, write events, or several events at once.

B. The monitored event FDs are copied into the kernel buffer and organized as a linked list.

C. The kernel copies the event FD list back to the user buffer and returns the number of ready FDs.

D. User space checks the number of ready FDs; if it is greater than 0, it traverses the event FD list to find them.
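A sketch of this flow with poll(), again assuming the sockets are created elsewhere (the helper name wait_with_poll is made up): the events field names the event of interest, and revents is filled in by the kernel.

#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>

/* fds is an array of nfds sockets to watch (created elsewhere);
   this sketch assumes nfds <= 128. */
void wait_with_poll(int *fds, int nfds) {
    struct pollfd pfds[128];
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;        /* A: register the read event to monitor */
        pfds[i].revents = 0;
    }

    /* B/C: the pollfd array is copied into the kernel, which waits
       and then returns the number of ready FDs. */
    int ready = poll(pfds, nfds, -1);
    if (ready <= 0) return;

    char buf[1024];
    /* D: poll does not say which FDs are ready, so user space still
       traverses the whole array and checks revents. */
    for (int i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN) {
            ssize_t n = recv(pfds[i].fd, buf, sizeof(buf), 0);
            printf("fd %d: read %zd bytes\n", pfds[i].fd, n);
        }
    }
}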

Problems with poll mode:

A. The entire FD list must be copied from user space into kernel space, and copied back to user space after polling.

B. poll cannot tell which specific FD events are ready, so user space still has to traverse the whole FD event list.

Compared with select mode:

Because a linked list is used, the number of events is theoretically unlimited, but as the number of events grows the traversal performance of the list degrades, and when no event is ready the thread still has to wait.

Epoll mode

Epoll mode improves on poll mode. First, the linked list used to store events is replaced with a red-black tree (so there is effectively no upper limit), and the red-black tree's performance stays stable as it grows. Second, only the events that are actually ready are collected and copied to the user buffer, so user space receives just the ready events and no longer needs to traverse the whole FD set.

Process (see the epoll sketch after this list):

A. First, register the events to be monitored.

B. Mount all FDs on a red-black tree in the kernel.

C. When an FD becomes ready, a callback function copies that FD into a ready list.

D. The ready list is copied from the kernel buffer to the user buffer, and its size n is returned.

E. The user thread simply checks n; when n is not 0, it can read the data of the FDs in the list directly (they are all ready).
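A sketch of these steps with the epoll API, assuming the sockets are created elsewhere (the helper name wait_with_epoll is made up): epoll_ctl mounts each FD on the kernel's red-black tree once, and epoll_wait returns only the ready events.

#include <stdio.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* fds is an array of nfds sockets to watch (created elsewhere). */
void wait_with_epoll(int *fds, int nfds) {
    int epfd = epoll_create1(0);            /* A: create the epoll instance */

    for (int i = 0; i < nfds; i++) {
        struct epoll_event ev = {0};
        ev.events = EPOLLIN;                /* register interest in read events */
        ev.data.fd = fds[i];
        epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev);  /* B: mount the FD on the red-black tree */
    }

    struct epoll_event ready[64];
    /* C/D: the kernel collects ready FDs into a list and copies only
       that list out; n is the number of ready events. */
    int n = epoll_wait(epfd, ready, 64, -1);

    char buf[1024];
    /* E: only the ready FDs are returned, so no full traversal is needed. */
    for (int i = 0; i < n; i++) {
        ssize_t r = recv(ready[i].data.fd, buf, sizeof(buf), 0);
        printf("fd %d: read %zd bytes\n", ready[i].data.fd, r);
    }

    close(epfd);
}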

Signal-driven IO

When the user application thread calls the Linux sigaction function (to register a signal handler), the call returns immediately and the thread goes off to do other things. When data arrives, kernel space sends a signal to user space; at that point user space calls recvfrom to copy the data from the kernel buffer into the user-space buffer and process it.
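A sketch of signal-driven IO on a socket, with the socket created elsewhere (the names setup_signal_driven_io and on_sigio are made up): the process registers a SIGIO handler with sigaction(), tells the kernel to deliver SIGIO for this FD, and only calls recvfrom() when the signal arrives.

#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int sockfd;                       /* the watched socket (created elsewhere) */

static void on_sigio(int signo) {
    char buf[1024];
    /* The kernel signalled that data arrived; now copy it out. */
    ssize_t n = recvfrom(sockfd, buf, sizeof(buf), 0, NULL, NULL);
    (void)signo; (void)n;                /* real code would hand the data off */
}

void setup_signal_driven_io(int fd) {
    sockfd = fd;

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigio;            /* register the SIGIO handler */
    sigaction(SIGIO, &sa, NULL);

    fcntl(fd, F_SETOWN, getpid());       /* deliver SIGIO to this process */
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_ASYNC); /* enable signal-driven IO on the FD */

    /* The thread is now free to do other work until SIGIO arrives. */
}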

It is like ordering at a restaurant: after you order, the waiter gives you a number, and when your meal is ready your number is called and you go to pick it up.

Problem:

Under heavy concurrency, signals are generated faster than the SIGIO handler can process them, so the queue that stores the signals may overflow. Moreover, kernel space and user space exchange signals frequently, which hurts performance.

Its performance is acceptable, but in actual development the application has to control its own thread concurrency, which makes it troublesome to implement, so it is rarely used.

Among the three IO multiplexing approaches, epoll works best: it solves the problems that exist in select mode and poll mode.

Redis uses the epoll-based IO multiplexing model.