[MUSIC] The design of our processor is now complete. A VHDL model has been generated and several test benches have been executed. In this last lesson, we will make several comments about the design method we have used and about possible alternative methods.
This is a summary of the method that we have used. First, we defined an initial functional specification. Then, we translated this functional specification into a block diagram made up of five blocks, namely "input selection", "output selection", "register bank", "computational resources" and "go to". Each of the five blocks was defined by its own functional specification. Then we translated every block specification into an RTL VHDL model. The five blocks, along with a very simple control unit, were assembled, and in this way we obtained an RTL VHDL description of the complete processor. It then remained to synthesize and implement the circuit, and for that synthesis and implementation automatic design tools were used. Here are the implementation results using a Xilinx FPGA, namely the xc3s500e device: the number of flip-flops is 200 and the number of 4-input lookup tables is 252.
Actually, another design method could have been used. Starting from our initial functional specification, an RTL VHDL model can be generated directly, and then automatic synthesis and implementation tools generate the circuit. Let us see an example of an RTL VHDL model directly deduced from the initial functional specification. First, as before, we generate a package that defines several constant values. Then, an entity declaration is generated; actually it is the same entity declaration as before, with the same inputs and outputs. Within the architecture, a user-defined type "memory" is defined, and a signal X is declared; it is the processor's memory (the elements X(i) that the instructions read and write). Another user-defined type "IOPort" is also defined, and signals InputPort and OutputPort are declared.
The first part of the architecture is only a list of connections such as: connect element 0 of InputPort to IN0, and so on. The same goes for the outputs OUT0, OUT1 and so on. Likewise, i is defined as the natural number obtained by converting bits 11 down to 8 of the instruction to an integer, and similarly for j and k.
The main part of the processor RTL description is a process very similar to the initial functional specification of the processor. The main difference with the functional specification is the explicit synchronization of the operations: one instruction is executed in every clock period. Thus this is an RTL description, not just a functional specification. This process defines a kind of finite state machine, plus operations associated with the instruction type. The states of this finite state machine are all the instruction numbers, and the instruction type, that is bits 15 down to 12 of the instruction, defines the corresponding operation and the new value of number (the new state of the finite state machine). For example, in the case of a JUMP_POS instruction, if the sign bit of X(i) is equal to 0 (which means that X(i) is non-negative) and if, furthermore, X(i) is not equal to 0 (so X(i) is strictly positive), then the new value of number is given by bits 7 down to 0 of the instruction; otherwise, the new value of number is just number + 1.
This second VHDL model has also been synthesized and implemented, and these are the implementation results. Now the circuit uses 72 flip-flops instead of 200, and 261 lookup tables: 229 lookup tables are used to implement logic gates, and 32 lookup tables are used to generate dual-port RAMs. If we compare the implementation results of both circuits, we observe that the second one needs far fewer flip-flops (72 instead of 200), but it uses dual-port RAMs.
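As a minimal sketch of the field extraction and of the JUMP_POS rule just described, here is a small C model. It is written in C rather than VHDL and is purely behavioural: there is no clock here. The helper names `field` and `next_number_jump_pos`, the 16-entry array `X` of 8-bit words and the 8-bit width of `number` are assumptions made for this illustration; they are not taken from the VHDL model of the lesson.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits hi down to lo of a 16-bit instruction word
   (for example bits 11 down to 8 for the index i). */
unsigned field(uint16_t instruction, unsigned hi, unsigned lo)
{
    assert(hi >= lo && hi < 16u);
    return ((unsigned)instruction >> lo) & ((1u << (hi - lo + 1u)) - 1u);
}

/* New value of number after a JUMP_POS instruction.
   Assumption: X holds sixteen 8-bit two's-complement words,
   so bit 7 of X[i] is its sign bit. */
uint8_t next_number_jump_pos(uint16_t instruction,
                             const uint8_t X[16],
                             uint8_t number)
{
    unsigned i = field(instruction, 11u, 8u);
    unsigned sign_bit = (X[i] >> 7) & 1u;

    if (sign_bit == 0u && X[i] != 0u) {
        /* X[i] is strictly positive: jump to the address in bits 7..0. */
        return (uint8_t)field(instruction, 7u, 0u);
    }
    /* Otherwise continue with the next instruction
       (wraps around after 255 in this 8-bit sketch). */
    return (uint8_t)(number + 1u);
}
```

For example, with the instruction word 0xC342 (field i equal to 3, bits 7 down to 0 equal to 0x42) and X[3] equal to 5, the function returns 0x42; with X[3] equal to 0, or with a negative X[3], it returns number + 1.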
In fact, our register bank implementation was a very low-level one, using sixteen 8-bit registers, that is to say 128 flip-flops, plus an address decoder and two 16-to-1 8-bit multiplexers. In the case of the second implementation, all those components are replaced by a 16 x 8 bit dual-port RAM. Hence, a natural question is: couldn't we improve our processor implementation, keeping our method based on five separately defined blocks, but in such a way that the register bank is synthesized in the form of a dual-port RAM? This is just a matter of changing the register bank description. In this case, we use this new description of the register bank, and then we let the synthesis tool automatically instantiate a dual-port RAM. And here are the implementation results. This circuit is implemented with 72 flip-flops (not 200) and it uses 135 lookup tables. It is the best of the three implementations considered. Thus, in conclusion, there is a direct relation between the chosen VHDL description and the implementation results.
Now, a still more advanced method. High-level synthesis tools (HLS tools) are now available. They make it possible to translate an initial specification, written for example in a programming language such as C, into an RTL VHDL description. So the designer is only responsible for this initial specification, and the rest of the design work is performed by automatic design tools. Here is an example of a functional description of our processor that can be interpreted by a commercial high-level synthesis tool called Vivado HLS. The main part is a C function hls_processor, with the declaration of all the inputs and outputs; this is the declaration of the register bank; here is the program counter; here is a set of connections, and so on. Then the function description is very similar to the second VHDL description. It is a case construct, where the numbers 0, 2 and so on correspond to instruction types.
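To give an idea of what such a description looks like, here is a rough C sketch in the same spirit. It is not the hls_processor function of the lesson: the name `processor_step`, the `processor_state` structure, the opcode values and the field layout of each instruction are assumptions chosen for this illustration (the lesson only tells us that the case labels are numbers such as 0 and 2), and only a subset of instruction types is shown.

```c
#include <stdint.h>

/* Hypothetical instruction type codes (bits 15..12); not the lesson's table. */
enum {
    OP_ASSIGN_VALUE = 0,  /* X[k] = constant in bits 11..4 (assumed layout) */
    OP_DATA_INPUT   = 2,  /* X[k] = input port j                            */
    OP_ADD          = 4,  /* X[k] = X[i] + X[j]                             */
    OP_SUB          = 5,  /* X[k] = X[i] - X[j]                             */
    OP_DATA_OUTPUT  = 10, /* output port i = X[j]                           */
    OP_JUMP_POS     = 12, /* jump to bits 7..0 if X[i] > 0                  */
    OP_JUMP         = 14  /* jump to bits 7..0                              */
};

typedef struct {
    uint8_t X[16];   /* register bank: sixteen 8-bit words */
    uint8_t number;  /* program counter                    */
} processor_state;

/* Execute one instruction and return the next value of number. */
uint8_t processor_step(processor_state *s, uint16_t instruction,
                       const uint8_t in_port[8], uint8_t out_port[8])
{
    unsigned type = (instruction >> 12) & 0xFu;
    unsigned i = (instruction >> 8) & 0xFu;
    unsigned j = (instruction >> 4) & 0xFu;
    unsigned k = instruction & 0xFu;
    uint8_t address = (uint8_t)(instruction & 0xFFu);
    uint8_t next = (uint8_t)(s->number + 1u);  /* default: next instruction */

    switch (type) {
    case OP_ASSIGN_VALUE:
        s->X[k] = (uint8_t)((instruction >> 4) & 0xFFu);
        break;
    case OP_DATA_INPUT:
        s->X[k] = in_port[j & 7u];   /* only 8 ports: keep the low 3 bits */
        break;
    case OP_ADD:
        s->X[k] = (uint8_t)(s->X[i] + s->X[j]);
        break;
    case OP_SUB:
        s->X[k] = (uint8_t)(s->X[i] - s->X[j]);
        break;
    case OP_DATA_OUTPUT:
        out_port[i & 7u] = s->X[j];
        break;
    case OP_JUMP:
        next = address;
        break;
    case OP_JUMP_POS:
        /* Sign bit 0 and value not 0: X[i] is strictly positive. */
        if ((s->X[i] & 0x80u) == 0u && s->X[i] != 0u) {
            next = address;
        }
        break;
    default:
        break;  /* other instruction types are omitted in this sketch */
    }

    s->number = next;
    return next;
}
```

Each call executes one instruction and returns the next value of number.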
But observe that this is not an RTL description: there is no clock signal, and the high-level synthesis tool is in charge of scheduling the operations.
Summary. Two alternative methods have been briefly described: the direct generation of a synthesizable VHDL model from an initial algorithmic specification, and (second alternative method) the automatic generation of a synthesizable VHDL model from an initial specification in some programming language, for example C or C++; this is the so-called "high-level synthesis".