RISC-V, an open-source instruction set, has been accepted and adopted by a growing number of companies, and some in the industry regard it as a strong rival to Arm.
Now another giant has joined the RISC-V camp: Samsung.
At the RISC-V Summit held in Silicon Valley on the 10th, Samsung said publicly that it would first use RISC-V in its 5G millimeter-wave RF IC.
According to Samsung, its first RISC-V RF test chip taped out in 2017. After years of testing, the design has matured, and it is planned to ship commercially in Samsung's flagship 5G phones in 2020.
The RF chip is only the beginning. According to Samsung's plan, RISC-V will also move into CMOS image sensors, security chips, autonomous driving and other fields.
Besides announcing the RISC-V core integration, this is also the first time Samsung has publicly discussed its mmWave module.
Unlike Qualcomm, Samsung tends to stay silent about its development work until a product is close to commercialization, so in many cases its designs are not publicly known.
However, it appears that the company has been working on mmWave designs for some time.
With this announcement, Samsung becomes the fourth major Arm customer publicly confirmed to be turning to RISC-V, after Western Digital, NVIDIA and Qualcomm.
Western Digital plans to use RISC-V in its SSD controllers, NVIDIA in its GPU controllers, and Qualcomm in its mobile SoCs.
The impact of the RISC-V architecture on China's RF industry
In the current environment, the RISC-V architecture has attracted considerable attention in China. Since 2018 there has been a surge of discussion about RISC-V in China's semiconductor community. In response, China released its first RISC-V support policy last year and established the China RISC-V Industry Alliance. The alliance has reportedly attracted more than 100 enterprises, including Ziguang Zhanrui, Huada and Jingchen, as well as more than 10 universities and research institutions such as Fudan and Jiaotong University.
Samsung's announcement that it will apply RISC-V to wireless RF sets a useful example for Chinese RF chip manufacturers.
The vitality of RISC-V will further promote the independent development of China's RF semiconductor industry.
1、 How RISC-V differs from other open architectures
Judged only on being "free" or "open", RISC-V is not the first processor architecture to be either.
Before going into RISC-V itself, we first look at several representative open architectures, to see how RISC-V differs from them and why the others have not achieved comparable success.
1.1 Grassroots hero - OpenRISC
OpenRISC is an open-source RISC processor released under the GPL license by the OpenCores organization.
OpenRISC has the following features:
OpenRISC has been used in projects at many companies, and it is fair to call it a widely used open-source processor implementation.
The weakness of OpenRISC is that it focuses on implementing an open-source CPU core rather than defining an open instruction set architecture. As a result, its architecture is not fully developed, its instruction set definition lacks the advantages of the RISC-V architecture mentioned in the previous section, and it never reached the point of having a dedicated foundation behind it. OpenRISC is more often regarded as an open-source core than as an elegant instruction set architecture. In addition, OpenRISC is licensed under the GPL, which means that all changes to the instruction set must themselves be open source (RISC-V has no such constraint).
1.2 Establishment giant - SPARC
The SPARC architecture is one of the classic RISC microprocessor architectures. It was first designed by Sun Microsystems in 1985. SPARC is also a registered trademark of SPARC International, an organization established in 1989 to promote the SPARC architecture and to run compatibility testing. To grow the SPARC ecosystem, SPARC International opened the standard and licensed it to many manufacturers, including Texas Instruments, Cypress Semiconductor and Fujitsu. Because the SPARC architecture is fully open, fully open-source processors such as LEON have also appeared. In addition, Sun pushed the SPARC V8 architecture to become an IEEE standard (IEEE Standard 1754-1994) in 1994.
Because the SPARC architecture was originally designed for servers, its most distinctive feature is a large set of register windows. A SPARC-compliant processor must implement between 72 and 640 general-purpose registers, each 64 bits wide, organized into a series of register groups called register windows.
The register window scheme can switch between register groups, so function calls and returns are handled quickly, which can deliver very high performance. However, the scheme is poorly suited to PC and embedded processors because of its high power and area cost. The SPARC architecture is also not modular, so users cannot trim it or pick the parts they need, which makes it hard for SPARC to replace the commercial x86 and Arm architectures as a general-purpose processor architecture.
Designing such a large server CPU is beyond the reach of ordinary companies and individuals, and companies that are capable of designing one have no reason to spend heavily on challenging the dominance of x86. With the decline of Sun, the SPARC architecture has largely faded from view. Interested readers can search online for the article "Goodbye SPARC processor, goodbye Sun".
1.3 Academic elite - RISC-V
The birth of RISC-V at the University of California, Berkeley
Two pioneers of modern computer architecture, John Hennessy and David Patterson, jointly received the 2017 ACM Turing Award. They are themselves among the initiators and promoters of RISC-V technology.
Both have also taken up positions at Google.
Dissatisfied with the complexity of processor architectures such as Arm and with the restrictions of the associated intellectual property, and driven by these pioneers, UC Berkeley decided to create a new instruction set architecture that any academic institution or commercial organization could use freely. The textbooks written by the two authors now use RISC-V and are being adopted by universities around the world, starting in the United States.
Internationally, RISC-V has also been designated a national standard instruction set by a number of countries, such as India.
It has attracted a great deal of attention in industry as well. For example, Samsung has made it clear that it will use RISC-V in related products.
In 2016 the RISC-V Foundation was established. Its founding members include Google, Western Digital, Taiwan Jingxin, MediaTek, Hangzhou Zhongtian, Huawei and others.
Over the years there have been many free or open CPU architectures, and many universities have produced instruction set architectures of their own in research projects. So when I first heard about RISC-V, I assumed it was a toy or a purely academic research project.
It was only after reading the RISC-V architecture documents from cover to cover that I came to be impressed by how advanced its design concepts are. The advantages of the RISC-V architecture have also won over many professionals and drawn in many commercial companies. In addition, the official launch of the RISC-V Foundation in 2016 had a large impact on the industry. All of this makes RISC-V the most revolutionary open processor architecture so far.
2、 Simplicity is beauty: the design philosophy of the RISC-V architecture
The RISC-V architecture is an instruction set architecture. Before going into the details, it helps to understand its design philosophy, that is, the strategy it advocates. For example, Japanese cars are known for a design philosophy of economy and fuel efficiency, while American cars are known for a bold, imposing style. What is the design philosophy of the RISC-V architecture? It is that "the greatest truths are the simplest".
One of the design principles the author values most is that simplicity is beauty, and simplicity means reliability. Countless practical cases have borne out that "simplicity means reliability". Conversely, the more complex a machine is, the more likely it is to go wrong.
In practical IC design work, the author has seen the simplest designs prove safe and reliable, and the most complex designs fail to converge for a long time. The most concise design is often the most reliable, something confirmed again and again across most projects.
IC design is an unusual kind of work. Its final output is a chip, and the design and manufacturing cycle of a chip is very long, so it cannot be upgraded and patched as easily as software. Each chip revision takes several months to deliver. Moreover, the one-time manufacturing cost of a chip is high, ranging from hundreds of thousands to millions of dollars. These characteristics make trial and error in IC design very expensive, so reducing the number of errors is essential.
Modern chip designs keep growing in scale and complexity. This does not mean designers should blindly avoid complex techniques. Rather, effort should be spent where it matters most: use the most complex design in the most critical scenarios, and choose a simple implementation wherever there is a choice.
When I first read the RISC-V architecture documents, I was struck with admiration, because the documents repeatedly and explicitly state that the design philosophy is that "the greatest truths are the simplest", and that the architecture definition aims to make hardware implementations as simple as possible. This philosophy of simplicity as beauty is easy to see in several aspects, which the following sections discuss one by one.
2.1 Travelling light - length of the architecture documents
In the processor field, the current mainstream architectures are x86 and Arm. The author has participated in the design of Arm-based application processors and therefore had to read the Arm architecture documents. Anyone familiar with them knows how long they are. After decades of development, the architecture documents of modern x86 and Arm run to hundreds or even thousands of pages. Printed out, they would stack half a desk high.
One of the main reasons the documents for modern x86 and Arm run to thousands of pages, in many versions, is that these architectures evolved alongside the continuous development and maturing of modern processor architecture technology itself.
Moreover, as commercial architectures, they must remain backward compatible, so they have to retain many outdated definitions, or work awkwardly around existing parts when defining new ones. Over time, they have become extremely lengthy.
It is almost impossible for a mature modern architecture to start over and redefine itself concisely. One important reason is that the result would not be backward compatible and so would not be accepted by users. Imagine buying a new computer or phone with a new processor, taking it home, and finding that none of your existing software runs and the device is as good as a brick. That would be unacceptable.
The recently launched RISC-V architecture has the latecomer's advantage. Computer architecture has become a relatively mature technology after years of development, and the problems exposed along the way have been thoroughly studied. The new RISC-V architecture can avoid those problems and carries no historical burden of backward compatibility. It can truly be said to travel light.
The current "RISC-V architecture document" is divided into the "instruction set document" (riscv-spec-v2.2.pdf) and the "privileged architecture document" (riscv-privileged-v1.10.pdf). The instruction set document is 145 pages long, and the privileged architecture document is only 91 pages long. An engineer familiar with processor architecture can read them through in just one or two days. Although the RISC-V architecture documents are still being extended, compared with the x86 and Arm architecture documents they are extremely short and concise.
2.2 Flexibility - modular instruction set
The biggest difference between the RISC-V architecture and other mature commercial architectures is that it is modular. RISC-V is therefore not only short and concise; its different parts can also be combined in a modular way, so that a single unified architecture can try to serve a wide variety of applications.
This modularity is not available in the x86 and Arm architectures. Take Arm as an example: the Arm architecture is divided into three series, A, R and M, aimed respectively at application processors (running application operating systems), real-time processors, and microcontrollers (embedded), and the three are not compatible with one another.
The modular RISC-V architecture, by contrast, lets users flexibly select different combinations of modules for different application scenarios, so it suits the small and the large alike. For example, for small-area, low-power embedded scenarios, users can select the RV32IC instruction set combination and use only machine mode. For high-performance scenarios running an application operating system, users can select, for example, the RV32IMFDC instruction set combination and use both machine mode and user mode. The parts the two have in common are compatible with each other.
2.3 Concentrated essence - number of instructions
The short, concise architecture and the modular philosophy keep the number of instructions in the RISC-V architecture very small. The base RISC-V instruction set has only a little over 40 instructions, and adding the other modular extension instructions brings the total to only a few dozen.
3、 Introduction to the RISC-V instruction set architecture
This chapter briefly introduces the characteristics of the RISC-V instruction set architecture.
3.1 Modular instruction subsets
The RISC-V instruction set is organized in a modular way, and each module is denoted by a letter. The most basic, and the only mandatory, part of RISC-V is the base integer instruction subset, denoted by the letter I. This integer subset alone is enough to support a complete software compiler. The other instruction subsets are optional modules. Representative modules include M / A / F / D / C, as shown in Table 1.
Table 1 Modular instruction set of RISC-V
To improve code density, the RISC-V architecture also provides an optional "compressed" instruction subset, denoted by the letter C. Compressed instructions are encoded in 16 bits, while ordinary uncompressed instructions are 32 bits long. One specific combination of modules, "IMAFD", is also known as the "general" combination and is denoted by the letter G. RV32G therefore stands for RV32IMAFD, and likewise RV64G stands for RV64IMAFD.
To reduce area further, the RISC-V architecture also provides an "embedded" base architecture, denoted by the letter E. It is mainly intended for deeply embedded scenarios that demand very low area and power. It only needs to support 16 general-purpose integer registers, whereas the non-embedded general-purpose architecture needs to support 32.
With this modular instruction set, different combinations can be selected for different applications. For example, RV32EC can be chosen for embedded scenarios that demand small area and low power, and RV64G can be chosen for large 64-bit designs.
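The naming scheme above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper name expand_isa and the returned dictionary layout are made up for this example, and only single-letter modules are handled.

```python
def expand_isa(name):
    """Split a name such as 'rv64g' into width, base and extension letters.

    Hypothetical helper for illustration; it handles only the single-letter
    modules discussed in the text.
    """
    name = name.lower()
    if not name.startswith("rv"):
        raise ValueError("an ISA name is expected to start with 'rv'")
    i = 2
    while i < len(name) and name[i].isdigit():
        i += 1
    xlen = int(name[2:i])                       # 32 or 64
    letters = name[i:].replace("g", "imafd")    # G is shorthand for IMAFD
    base = letters[0]
    if base not in ("i", "e"):
        raise ValueError("the base must be I or E")
    return {
        "xlen": xlen,
        "base": base,
        "extensions": letters[1:],
        "int_regs": 16 if base == "e" else 32,  # E halves the register count
    }


print(expand_isa("rv32g"))
print(expand_isa("rv32ec"))
```

Expanding G before inspecting the base letter keeps the function short, at the cost of accepting some names a stricter parser would reject.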
Beyond the modules above, there are several others, including L, B, P, V and T. Most of these extensions are still being refined and have not been finalized, so this article does not discuss them in detail.
3.2 Configurable general-purpose register file
The RISC-V architecture supports 32-bit and 64-bit variants. The 32-bit architecture is denoted RV32, and each general-purpose register is 32 bits wide. The 64-bit architecture is denoted RV64, and each general-purpose register is 64 bits wide.
The integer general-purpose register file of the RISC-V architecture contains 32 (I architecture) or 16 (E architecture) integer registers. Integer register 0 is reserved as the constant 0, and the other 31 (I architecture) or 15 (E architecture) are ordinary general-purpose integer registers.
If a floating-point module (F or D) is used, a separate floating-point register file containing 32 general-purpose floating-point registers is also required. If only the F module's floating-point instruction subset is used, each floating-point register is 32 bits wide. If the D module's floating-point instruction subset is used, each floating-point register is 64 bits wide.
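As a rough software model of the integer register file described above, the sketch below hard-wires register 0 to zero and makes the width and register count configurable. The class name IntRegFile and its parameters are assumptions made for this illustration, not part of any RISC-V specification or tool.

```python
class IntRegFile:
    """Toy model of the integer register file (illustrative only)."""

    def __init__(self, xlen=32, embedded=False):
        self.mask = (1 << xlen) - 1                   # keep values within XLEN bits
        self.regs = [0] * (16 if embedded else 32)    # E: 16 registers, I: 32

    def read(self, idx):
        return self.regs[idx]

    def write(self, idx, value):
        # Register 0 is the constant zero, so writes to it are discarded.
        if idx != 0:
            self.regs[idx] = value & self.mask


rf = IntRegFile(xlen=32)
rf.write(0, 123)   # ignored
rf.write(5, -1)    # stored as a 32-bit pattern
print(rf.read(0), hex(rf.read(5)))
```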
3.3 Regular instruction encoding
In processor pipeline design, reading the general-purpose register file as early as possible is a common goal, because it improves performance and eases timing. This seemingly simple goal is hard to achieve in many existing commercial RISC architectures: after years of repeated modification and the continual addition of new instructions, the positions of the register indexes within the instruction encoding have become very irregular, which burdens the decoder.
Thanks to the latecomer's advantage and the lessons learned from years of processor development, the RISC-V instruction encoding is very regular. The indexes of the general-purpose registers an instruction needs are placed at fixed positions, as shown in Figure 2. The instruction decoder can therefore extract the register indexes easily and read the general-purpose register file (regfile) right away.
Figure 2 RV32I regular instruction encoding format
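To make the fixed field positions concrete, here is a minimal Python sketch that pulls the register indexes out of a 32-bit instruction word. The bit positions follow the RV32I base formats (rd in bits 11:7, rs1 in bits 19:15, rs2 in bits 24:20). The function name decode_fields is made up for this example, and the two instruction words were encoded by hand for illustration.

```python
def decode_fields(instr):
    """Extract the fixed-position fields of a 32-bit RISC-V instruction word."""
    return {
        "opcode": instr & 0x7F,          # bits 6:0
        "rd":     (instr >> 7) & 0x1F,   # bits 11:7
        "funct3": (instr >> 12) & 0x7,   # bits 14:12
        "rs1":    (instr >> 15) & 0x1F,  # bits 19:15
        "rs2":    (instr >> 20) & 0x1F,  # bits 24:20
        "funct7": (instr >> 25) & 0x7F,  # bits 31:25
    }


ADD_X3_X1_X2 = 0x002081B3   # add x3, x1, x2  (R-type)
SW_X2_8_X1   = 0x0020A423   # sw  x2, 8(x1)   (S-type)

print(decode_fields(ADD_X3_X1_X2))
print(decode_fields(SW_X2_8_X1))
```

Both words yield rs1 = 1 and rs2 = 2 from the same bit positions even though one is R-type and the other S-type, which is what lets a decoder start the register read before it has fully identified the instruction. Not every format uses every field; in the S-type word, bits 11:7 hold part of the immediate rather than rd.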
3.4 Concise memory access instructions
Like all RISC processor architectures, the RISC-V architecture accesses memory through dedicated memory read (load) and memory write (store) instructions, and no other instruction can access memory. This is a common basic strategy of RISC architectures, and it keeps the hardware design of the processor core simple.
The basic unit of memory access is the byte. RISC-V load and store instructions support accesses of one byte (8 bits), a halfword (16 bits) and a word (32 bits). On the 64-bit architecture, accesses of a doubleword (64 bits) are also supported.
The memory access instructions of the RISC-V architecture also have the following notable features:
For better memory access performance, the RISC-V architecture recommends address-aligned loads and stores, but misaligned accesses are also supported by the architecture. A processor may support them either in hardware or in software.
Because little-endian is the mainstream format today, the RISC-V architecture supports only little-endian. The definitions of little-endian and big-endian and the differences between them are not covered here; readers unfamiliar with them can look them up.
Many RISC processors support address auto-increment or auto-decrement modes. Although these modes can speed up access to contiguous memory ranges, they also make the processor harder to design. The load and store instructions of the RISC-V architecture do not support address auto-increment or auto-decrement.
The RISC-V architecture adopts a relaxed memory model. A relaxed memory model places no requirement on the execution order of memory accesses to different addresses, unless they are constrained by explicit memory fence instructions.
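The access widths and the little-endian byte order described above can be illustrated with a small Python model. The Memory class and its method names are assumptions made for this sketch. It models only byte ordering and access width, not alignment penalties or memory ordering between harts.

```python
class Memory:
    """Toy byte-addressed, little-endian memory (illustrative only)."""

    def __init__(self, size):
        self.data = bytearray(size)

    def store(self, addr, value, nbytes):
        # nbytes is 1 (byte), 2 (halfword), 4 (word) or 8 (doubleword).
        value &= (1 << (8 * nbytes)) - 1
        self.data[addr:addr + nbytes] = value.to_bytes(nbytes, "little")

    def load(self, addr, nbytes, signed=False):
        return int.from_bytes(self.data[addr:addr + nbytes], "little", signed=signed)


mem = Memory(16)
mem.store(0, 0x12345678, 4)           # word store
print(hex(mem.load(0, 1)))            # lowest address holds the least significant byte
print(hex(mem.load(2, 2)))            # upper halfword
print(hex(mem.load(1, 2)))            # misaligned halfword, still well defined
mem.store(8, 0xFF, 1)
print(mem.load(8, 1, signed=True))    # sign-extending byte load
```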
These choices clearly reflect the RISC-V philosophy of simplifying the base instruction set in order to simplify hardware design. The RISC-V architecture is defined sensibly enough to be flexible at both ends of the range. A simple low-power CPU can be built with very simple hardware, while a high-performance superscalar processor can rely on the dynamic hardware scheduling of a more complex design to improve performance.
3.5 Efficient branch and jump instructions
The RISC-V architecture has two unconditional jump instructions, JAL and JALR. The jump-and-link instruction JAL can be used to call a subroutine, storing the subroutine's return address in the link register (a role played by one of the general-purpose integer registers). The jump-and-link-register instruction JALR can be used to return from a subroutine: using the link register saved by the JAL instruction (on entry to the subroutine) as the base address register of the JALR instruction returns control to the caller.
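The call and return pairing can be sketched in Python as follows. This is a simplified model and not a full simulator: the function names are made up for the example, register x1 is used as the link register following the usual calling convention, and address wrap-around is ignored.

```python
def write_reg(regs, idx, value):
    """Write a register, discarding writes to x0."""
    if idx != 0:
        regs[idx] = value


def jal(pc, regs, rd, offset):
    """Jump and link: save the return address, jump relative to pc."""
    write_reg(regs, rd, pc + 4)
    return pc + offset


def jalr(pc, regs, rd, rs1, offset):
    """Jump and link register: jump to rs1 + offset with bit 0 cleared."""
    target = (regs[rs1] + offset) & ~1   # compute before rd is written
    write_reg(regs, rd, pc + 4)
    return target


regs = [0] * 32
pc = 0x100
pc = jal(pc, regs, rd=1, offset=0x40)    # call: x1 now holds 0x104
print(hex(pc), hex(regs[1]))
pc = 0x148                               # ... the subroutine body has run ...
pc = jalr(pc, regs, rd=0, rs1=1, offset=0)   # return: jump to x1, discard link
print(hex(pc))
```

Using x0 as the destination of the returning JALR throws the link value away, which is why no separate "return" instruction is needed.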
The RISC-V architecture has six conditional branch instructions. Like ordinary arithmetic instructions, a conditional branch takes two integer operands directly and compares them. If the comparison condition is met, the branch is taken. This kind of instruction therefore combines the comparison and the jump in a single instruction.
By comparison, many other RISC processors need two separate instructions. The first is a compare instruction, whose result is saved in a status register. The second is a jump instruction, which jumps when the comparison result saved in the status register is true. The RISC-V conditional branch not only reduces the number of instructions but also makes the hardware design simpler.
For low-end CPUs without a hardware branch predictor, the RISC-V architecture specifies a default static branch prediction scheme to protect their performance: a conditional branch that jumps backward is predicted "taken", and a conditional branch that jumps forward is predicted "not taken". The architecture also asks compilers to generate code with this default static prediction scheme in mind, so that low-end CPUs can achieve good performance too.
To keep the hardware as simple as possible, the RISC-V architecture defines the jump target offset of every conditional branch (relative to the address of the current instruction) as a signed number whose sign bit is encoded at a fixed position. This makes the static prediction scheme very easy to implement in hardware. The decoder can simply locate this fixed bit and check whether it is 0 or 1 to tell whether the offset is positive or negative. A negative offset means the target lies before the current address, that is, a backward jump, so the branch is predicted "taken". High-end CPUs equipped with a hardware branch predictor can of course use more advanced dynamic branch prediction to ensure performance.
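The sign-bit check can be expressed in a couple of lines. In the base 32-bit branch format the sign of the offset sits in bit 31 of the instruction word, so a static predictor only has to look at that one bit. The function name predict_taken is made up for this sketch, and the two instruction words were encoded by hand.

```python
BRANCH_OPCODE = 0x63  # major opcode shared by the six conditional branches


def predict_taken(instr):
    """Static prediction: backward branches taken, forward branches not taken."""
    if (instr & 0x7F) != BRANCH_OPCODE:
        raise ValueError("not a conditional branch")
    return ((instr >> 31) & 1) == 1   # bit 31 is the sign of the offset


BEQ_BACK_4 = 0xFE000EE3   # beq x0, x0, -4  (backward)
BEQ_FWD_8  = 0x00000463   # beq x0, x0, +8  (forward)

print(predict_taken(BEQ_BACK_4))
print(predict_taken(BEQ_FWD_8))
```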
3.6 Concise subroutine calls
To understand this section, it helps to first describe how a subroutine call works in a typical RISC architecture. The process is as follows:
On entering the subroutine, memory write (store) instructions save the current context (the values of general-purpose registers and so on) to the stack area in system memory. This step is usually called "saving the context".
On leaving the subroutine, memory read (load) instructions read the previously saved context (the values of general-purpose registers and so on) back from the stack area in system memory. This step is usually called "restoring the context".
Saving and restoring the context is normally done by instructions that the compiler generates. Developers working in a high-level language (such as C or C++) do not need to think about it much and can simply write a function call in their program. Underneath, however, the save and restore steps really do happen (the corresponding instructions are visible in the compiled assembly code), and they consume CPU execution time.
To speed up saving and restoring the context, some RISC architectures introduced instructions that write multiple registers to memory, or read multiple registers from memory, in one go. The advantage is that a single instruction does a lot of work, which reduces the number of assembly instructions and saves code space. The disadvantage of such "load multiple" and "store multiple" instructions is that they complicate the CPU's hardware design, increase hardware overhead, and can hurt timing so that the CPU's clock frequency cannot be raised. The author suffered a great deal from this when designing processors of this kind.
The RISC-V architecture does without "load multiple" and "store multiple" instructions. It also explains that where the number of save and restore instructions matters, a shared library routine (dedicated to saving and restoring the context) can be used, which avoids placing a varying number of save and restore instructions in every subroutine call.
This choice once again confirms RISC-V's pursuit of hardware simplicity, because dropping the "load multiple" and "store multiple" instructions greatly simplifies the CPU's hardware design. A low-power, small-area CPU can be implemented with very simple circuits. A high-performance superscalar processor has strong dynamic scheduling and can include a powerful branch prediction circuit so that jumps execute quickly, so it can use the shared library routine (dedicated to saving and restoring the context) to reduce code size while still achieving high performance.
3.7 No condition-code execution
Many early RISC architectures introduced instructions carrying a condition code. For example, the first few bits of the instruction encoding hold a condition code, and the instruction actually executes only when the condition it names is true.
Encoding the condition into the instruction lets the compiler turn short loops into conditionally executed instructions instead of branch instructions. Reducing the number of branches reduces the instruction count on the one hand, and on the other avoids the performance loss that branches cause. The disadvantage of such "condition code" instructions is, again, that they complicate the CPU's hardware design, increase hardware overhead, and can hurt timing so that the CPU's clock frequency cannot be raised. The author suffered a great deal from this when designing processors of this kind.
The RISC-V architecture does without "condition code" instructions and uses ordinary conditional branch instructions for every condition test. This choice once again confirms RISC-V's pursuit of hardware simplicity, because dropping condition-code instructions greatly simplifies the CPU's hardware design. A low-power, small-area CPU can be implemented with very simple circuits, while a high-performance superscalar processor has strong dynamic scheduling and can include a powerful branch prediction circuit so that jumps execute quickly and performance stays high.
3.8 No branch delay slot
Many early RISC architectures used a "branch delay slot". The most representative is the MIPS architecture, which many classic computer architecture textbooks use to introduce the concept. A branch delay slot means that the one or more instructions immediately following a branch instruction are unaffected by the branch: whether or not the branch is taken, those instructions are always executed.
Branch delay slots arose mainly because early processor pipelines were relatively simple and had no advanced dynamic hardware branch predictor, so a delay slot delivered a worthwhile performance gain. However, the branch delay slot makes the CPU's hardware design very awkward, and CPU designers have often suffered for it.
The RISC-V architecture does without the branch delay slot, which once again confirms its philosophy of simplifying hardware. Branch prediction in modern high-performance processors is already highly accurate, and a powerful branch prediction circuit can ensure that the CPU predicts jumps correctly and achieves high performance. For a low-power, small-area CPU, not having to support a branch delay slot greatly simplifies the hardware, which further reduces power consumption and improves timing.
3.9 No zero-overhead hardware loop
Many RISC architectures also support zero-overhead hardware loop instructions. The idea is to let the hardware take part directly: software sets up loop count registers, and the program then loops automatically. Each iteration automatically decrements the loop count by 1, and the loop continues until the count reaches 0, at which point it exits.
Hardware-assisted zero-overhead loops were proposed because for loops of the form for (i = 0; i < n; i++) are very common in software, and the compiler usually turns them into several addition instructions and conditional branch instructions. On the one hand, these addition and conditional branch instructions add to the instruction count. On the other hand, conditional branches bring the performance problems of branch prediction. A hardware-assisted zero-overhead loop has the hardware do this work directly, eliminating the addition and conditional branch instructions, which reduces the instruction count and improves performance.
However, every gain has its cost. Zero-overhead hardware loop instructions greatly increase the complexity of the hardware design, so they run directly counter to the RISC-V philosophy of simplifying hardware. Naturally, the RISC-V architecture does not use them.
3.10 concise operation instructions
In Section 2.1 of this chapter, it was mentioned that risc-v architecture organizes different instruction subsets in a modular way. The most basic integer instruction subset (represented by I letter) supports operations including addition, subtraction, shift, bitwise logic operation and comparison operation. These basic operations can complete more complex operations (such as multiplication and division and floating-point operations) through combination or function library, so as to complete most software operations.
The integer multiplication and division instruction subset (denoted by the letter M) supports signed and unsigned multiplication and division. Multiplication can multiply two 32-bit integers to obtain a 64-bit result; division can divide two 32-bit integers to obtain a 32-bit quotient and a 32-bit remainder.
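How a 64-bit product comes out of 32-bit registers can be sketched as follows. This is a minimal model, assuming the RV32M convention that one instruction (mul) returns the low 32 bits of the product and a second one (mulh or mulhu) returns the high 32 bits; the Python helper names are illustrative and register values are modeled as unsigned 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul(rs1, rs2):
    """Low 32 bits of the product (identical for signed and unsigned)."""
    return (rs1 * rs2) & MASK32


def mulh(rs1, rs2):
    """High 32 bits of the signed x signed 64-bit product."""
    return ((to_signed32(rs1) * to_signed32(rs2)) >> 32) & MASK32


def mulhu(rs1, rs2):
    """High 32 bits of the unsigned x unsigned 64-bit product."""
    return (((rs1 & MASK32) * (rs2 & MASK32)) >> 32) & MASK32
```

For example, with rs1 = 0xFFFFFFFF and rs2 = 2, mul returns 0xFFFFFFFE in both interpretations, while mulh returns 0xFFFFFFFF (the product is -2) and mulhu returns 1 (the product is 0x1FFFFFFFE).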
The single-precision floating-point instruction subset (denoted by the letter F) and the double-precision floating-point instruction subset (denoted by the letter D) support floating-point addition, subtraction, multiplication, division, multiply-accumulate, square root and comparison. They also provide format conversion between integer and floating-point values and between single-precision and double-precision floating-point values.
Many RISC processors raise a software exception when an arithmetic instruction encounters an error such as overflow, underflow, a denormalized floating-point number or division by zero. A special feature of the RISC-V architecture is that it does not raise an exception for any arithmetic instruction error (integer or floating-point). Instead, the instruction produces a special default value and, where applicable, sets status bits in a status register. The RISC-V architecture recommends that software detect these errors by other means. This again clearly reflects the philosophy of keeping the basic instruction set as simple as possible in order to simplify the hardware design.
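The "default value instead of an exception" behavior can be illustrated with integer division. The sketch below assumes the defaults given in the RV32M specification rather than anything stated in this article: division by zero yields an all-ones quotient and a remainder equal to the dividend, and the signed overflow case (-2^31 divided by -1) yields the dividend as quotient and 0 as remainder. The Python function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def div(rs1, rs2):
    """Signed 32-bit quotient that never traps."""
    a, b = to_signed32(rs1), to_signed32(rs2)
    if b == 0:
        return MASK32                    # default value: all ones (-1)
    if a == -(1 << 31) and b == -1:
        return 0x80000000                # overflow: quotient equals dividend
    q = abs(a) // abs(b)                 # truncate toward zero
    return (-q if (a < 0) != (b < 0) else q) & MASK32


def rem(rs1, rs2):
    """Signed 32-bit remainder that never traps."""
    a, b = to_signed32(rs1), to_signed32(rs2)
    if b == 0:
        return a & MASK32                # default value: the dividend
    if a == -(1 << 31) and b == -1:
        return 0                         # overflow: remainder is zero
    r = abs(a) % abs(b)                  # sign follows the dividend
    return (-r if a < 0 else r) & MASK32
```

Software that cares about division by zero can test the divisor with an ordinary branch before or after the division, which is the "other means" the text refers to.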
3.11 elegant compressed instruction subset
The basic RISC-V integer instruction subset (denoted by the letter I) specifies a uniform instruction length of 32 bits. This fixed-length definition makes it very easy to design a basic RISC-V CPU that supports only the integer instruction subset. However, fixed-length 32-bit encoding also leads to relatively large code size.
To serve scenarios with tight code-size constraints (such as the embedded field), RISC-V defines an optional compressed instruction subset, denoted by the letter C or by RVC. Here RISC-V benefits from its latecomer advantage: compressed instructions were planned from the beginning and enough encoding space was reserved for them. 16-bit instructions and ordinary 32-bit instructions can be interleaved seamlessly and freely, and the processor does not need to define any additional state.
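Free interleaving works because the length of an instruction can be read from its lowest bits. The following is a minimal sketch, assuming the standard RISC-V length encoding in which an instruction whose two lowest bits are not 11 is 16 bits long, and one whose lowest bits are 11 (with bits 4 to 2 not all ones) is 32 bits long. The function names are illustrative, and longer encodings are deliberately not handled.

```python
def instruction_length(first_halfword):
    """Return the instruction length in bytes from its lowest 16 bits."""
    if first_halfword & 0b11 != 0b11:
        return 2                         # compressed 16-bit instruction
    if (first_halfword >> 2) & 0b111 != 0b111:
        return 4                         # ordinary 32-bit instruction
    raise ValueError("encodings longer than 32 bits are not handled here")


def split_stream(code):
    """Walk little-endian machine code and list each instruction's length."""
    lengths = []
    pc = 0
    while pc < len(code):
        half = int.from_bytes(code[pc:pc + 2], "little")
        size = instruction_length(half)
        lengths.append(size)
        pc += size
    return lengths
```

No mode bit or mode-switch instruction is involved: the decoder only inspects the bits at the current program counter, which is why the processor needs no additional state.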
Another special feature of the RISC-V compressed instructions is their compression strategy: a 16-bit instruction is obtained by compressing and rearranging the information in some of the most commonly used 32-bit instructions (for example, if an instruction uses the same operand index twice, the encoding space for one index can be omitted). As a result, each 16-bit instruction maps one-to-one onto an original 32-bit instruction. The program can therefore be turned into compressed instructions at the assembler stage alone, which greatly reduces the burden on the compiler toolchain.
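The one-to-one mapping can be sketched for a single instruction. The example below assumes the RVC encoding of C.ADDI (quadrant 01, funct3 000, a 6-bit signed immediate and one register field used as both source and destination) and expands it into the equivalent 32-bit ADDI. The function name is illustrative, and only this one instruction is handled.

```python
def expand_c_addi(halfword):
    """Expand a 16-bit C.ADDI into the equivalent 32-bit ADDI encoding."""
    if halfword & 0b11 != 0b01 or (halfword >> 13) & 0b111 != 0b000:
        raise ValueError("not a C.ADDI encoding")
    rd = (halfword >> 7) & 0x1F          # register is both source and destination
    imm = ((halfword >> 2) & 0x1F) | (((halfword >> 12) & 1) << 5)
    if imm & 0x20:
        imm -= 0x40                      # sign-extend the 6-bit immediate
    imm12 = imm & 0xFFF                  # ADDI carries a 12-bit immediate
    # ADDI layout: imm[11:0] | rs1 | funct3=000 | rd | opcode=0010011
    return (imm12 << 20) | (rd << 15) | (rd << 7) | 0x13
```

For instance, 0x0505 (c.addi a0, 1) expands to 0x00150513 (addi a0, a0, 1). The register index appears once in the 16-bit form and twice in the 32-bit form, which is exactly the saving described above.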
RISC-V architecture researchers conducted a detailed code-size analysis, as shown in Figure 3. The results show that the code size of RV32C is 40% smaller than that of RV32, and that it compares well with ARM, MIPS, x86 and other architectures.
Figure 3 code density comparison of instruction set architectures (smaller values are better)
3.12 privilege mode
The RISC-V architecture defines three working modes, also known as privilege modes:
Machine mode, abbreviated as M mode.
Supervisor mode, abbreviated as S mode.
User mode, abbreviated as U mode.
The RISC-V architecture defines M mode as mandatory and the other two modes as optional. Different kinds of systems can be built from different combinations of modes.
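The combinations can be written down explicitly. The sketch below assumes the level encodings and the three supported combinations listed in the RISC-V privileged specification (M only, M with U, and M with S and U); the Python names and the helper function are illustrative.

```python
from enum import IntEnum


class Privilege(IntEnum):
    """Privilege levels with their assumed 2-bit encodings."""
    U = 0b00
    S = 0b01
    M = 0b11


# Supported combinations and the kind of system each one targets.
SUPPORTED_COMBINATIONS = {
    frozenset({Privilege.M}): "simple embedded systems",
    frozenset({Privilege.M, Privilege.U}): "secure embedded systems",
    frozenset({Privilege.M, Privilege.S, Privilege.U}):
        "systems running Unix-like operating systems",
}


def is_valid_combination(modes):
    """Check whether a set of implemented modes is a supported combination."""
    return frozenset(modes) in SUPPORTED_COMBINATIONS
```

Every supported combination contains M mode, which matches the statement that M mode is mandatory.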
The RISC-V architecture also supports several memory address management mechanisms, covering both physical and virtual addressing. This allows it to support everything from simple embedded systems (which operate directly on physical addresses) to complex operating systems (which operate on virtual addresses).
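As one illustration of virtual addressing, the sketch below splits a 32-bit virtual address the way the Sv32 scheme of the privileged specification does: two 10-bit virtual page numbers and a 12-bit page offset. Sv32 is not named in this article, so treat the choice of scheme and the function name as assumptions made for the example.

```python
def split_sv32(virtual_address):
    """Split a 32-bit virtual address into (VPN[1], VPN[0], page offset)."""
    va = virtual_address & 0xFFFFFFFF
    vpn1 = (va >> 22) & 0x3FF   # index into the first-level page table
    vpn0 = (va >> 12) & 0x3FF   # index into the second-level page table
    offset = va & 0xFFF         # byte offset inside the 4 KiB page
    return vpn1, vpn0, offset
```

A simple embedded system skips this step entirely and uses the address as a physical address, while an operating system with virtual memory walks the page tables using the two indices.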
3.13 CSR register
The RISC-V architecture defines a set of control and status registers (CSRs) that configure the processor or record its running state. CSRs are registers inside the processor core. They use their own address encoding space, which is independent of the address range used for memory.
CSRs are accessed with dedicated CSR instructions: csrrw, csrrs, csrrc, csrrwi, csrrsi and csrrci.
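The behavior of the three register-operand instructions can be modeled in a few lines. This is a simplified sketch: it assumes a 12-bit CSR address space, treats every CSR as a plain readable and writable 32-bit value, and ignores access permissions and the special cases in which a write is suppressed. The class name CsrFile is hypothetical.

```python
class CsrFile:
    """Toy model of a CSR address space that is separate from memory."""

    def __init__(self):
        self.regs = {}

    def _read(self, addr):
        if not 0 <= addr < 4096:          # assumed 12-bit CSR address
            raise ValueError("CSR address out of range")
        return self.regs.get(addr, 0)

    def csrrw(self, addr, value):
        """Atomic read/write: return the old value, store the new one."""
        old = self._read(addr)
        self.regs[addr] = value & 0xFFFFFFFF
        return old

    def csrrs(self, addr, mask):
        """Atomic read and set bits: return the old value, OR in the mask."""
        old = self._read(addr)
        self.regs[addr] = old | (mask & 0xFFFFFFFF)
        return old

    def csrrc(self, addr, mask):
        """Atomic read and clear bits: return the old value, clear the mask."""
        old = self._read(addr)
        self.regs[addr] = old & ~mask & 0xFFFFFFFF
        return old
```

The immediate forms (csrrwi, csrrsi, csrrci) behave the same way, except that the operand is a small immediate encoded in the instruction instead of a register value.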
3.14 interrupts and exceptions
The interrupt and exception mechanism is often the most complex and most critical part of a processor instruction set architecture. The RISC-V architecture defines a relatively simple, basic interrupt and exception mechanism, but it also allows users to customize and extend it.
3.15 vector instruction subset
Although the RISC-V architecture does not yet have a finalized vector instruction subset, the current draft shows that its design is very advanced. Thanks to the latecomer advantage and the lessons drawn from many years of vector architecture development, RISC-V will use variable-length vectors instead of a fixed-length vector SIMD instruction set (such as ARM NEON and Intel MMX), so that it can flexibly support different implementations. A CPU that targets low power consumption and small area can implement short hardware vectors, while a high-performance CPU can implement longer hardware vectors, and the same software code remains compatible with both.
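Why the same code can run on both short and long hardware vectors can be shown with a strip-mined loop. The sketch below is a simplified model, not the draft specification itself: the parameter vlmax stands for the hardware vector length of a given implementation, and the line that computes vl stands in for the instruction that asks the hardware how many elements it will process in this pass. The function name is illustrative.

```python
def vector_add(a, b, vlmax):
    """Add two equal-length lists in strips of at most vlmax elements."""
    n = len(a)
    out = [0] * n
    strips = 0
    i = 0
    while i < n:
        vl = min(n - i, vlmax)           # elements granted for this strip
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        i += vl
        strips += 1
    return out, strips
```

Calling the function with vlmax set to 4 or to 16 produces the same result; only the number of strips changes. The loop never hard-codes the vector width, which is the property that distinguishes this style from fixed-length SIMD code.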
3.16 custom instruction extension
Besides the extensibility and selectability of the modular instruction subsets described above, the RISC-V architecture has another very important feature: support for third-party extensions. Users can add their own instruction subsets, and RISC-V reserves a large amount of instruction encoding space for user-defined extensions. In addition, four custom instructions are predefined for direct use, and each of them reserves several bits of sub-encoding space, so users can derive dozens of custom instructions from these four.
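The way a few reserved encodings fan out into many instructions can be sketched as follows. The example assumes that the four custom instructions correspond to the custom-0 to custom-3 major opcodes of the base opcode map and that a custom instruction uses the standard R-type field layout; the function name and the slot parameter are hypothetical.

```python
# Assumed major opcodes reserved for custom extensions (custom-0..custom-3).
CUSTOM_OPCODES = {0: 0b0001011, 1: 0b0101011, 2: 0b1011011, 3: 0b1111011}


def encode_custom_r(slot, funct7, funct3, rd, rs1, rs2):
    """Pack an R-type instruction word under one of the custom opcodes."""
    if slot not in CUSTOM_OPCODES:
        raise ValueError("slot must be 0, 1, 2 or 3")
    if not (0 <= funct7 < 128 and 0 <= funct3 < 8):
        raise ValueError("funct field out of range")
    if not all(0 <= r < 32 for r in (rd, rs1, rs2)):
        raise ValueError("register index out of range")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | CUSTOM_OPCODES[slot])
```

Using only the 3-bit funct3 field as the sub-encoding already gives 4 x 8 = 32 distinct custom instructions, which is in line with the "dozens" mentioned above; the funct7 field leaves room for more.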
3.17 summary and comparison
After decades of progress in large-scale integrated circuit design, processor design now shows the following characteristics:
High-performance processors already have very strong hardware scheduling capability and very high clock frequencies, so hardware designers want the instruction set to be as regular and simple as possible, allowing processors to reach higher clock frequencies with smaller area.
Very low-power processors for IoT applications place even stricter demands on power consumption and area.
Memory resources are far more plentiful than in the era of early RISC processors.
Because of these factors, many design concepts of early RISC architectures (which grew out of the technology of their time) no longer help modern processor design and have instead become a burden. On the one hand, some features defined by early RISC architectures make the hardware design of high-performance processors difficult; on the other hand, they force the hardware design of very low-power processors to carry unnecessary complexity.
Thanks to its latecomer advantage, the new RISC-V architecture can avoid all of these known burdens, and it applies an advanced design philosophy to define a "modern" instruction set. This section summarizes its characteristics once more, as shown in Table 2.
Table 2 summary of risc-v instruction set architecture features
This article is reproduced from "Microwave RF network" in support of the protection of intellectual property rights. Please indicate the original source and author when reprinting. If there is any infringement, please contact us for removal.