# Design and implementation 4x4 Network on Chip (NoC) using FPGA

Waleed K. Al-Azzwai <sup>1</sup>, Aqeel A. Al-Hilali <sup>1</sup>, Laith F. Jumma<sup>2</sup>

<sup>1</sup>Department of Medical Instrumentation, College of Medical Techniques, Al-Farahidi University, Iraq

<sup>2</sup>Department of Medical Instrumentation, Al-Esraa University College, Iraq

## **ABSTRACT**

Network on Chip (NoC) has become one of the most important developments in on-chip networks and routers in recent years, owing to improvements in how large data packets are transferred and in the technologies and protocols used to send packets between IP cores. NoC also replaces older techniques and protocols for sending packets through routers, and it is well suited to parallel operation. Hence, a Field Programmable Gate Array (FPGA) is used to design a router whose architecture is based on NoC. This paper proposes a 4x4-node router with a two-dimensional mesh network. A real-time test on the FPGA was used for the experiments, and it verified efficient packet transport via the two buffer arrays integrated in the proposed router architecture.

**Keywords**: NoC; Network on Chip; FPGA; Router; SoC.

## Corresponding Author:

Aqeel A. Al-Hilali, Department of Medical Instrumentation, College of Medical Techniques, Al-Farahidi University, Baghdad, Iraq

E-mail: akeel.alhilali@alfarahidiuc.edu.iq

# 1. Introduction

Network-on-Chip (NoC) has received considerable attention in communication design in recent years because its architecture can provide higher bandwidth and performance than traditional communication schemes, and because it supports on-chip interaction. NoC provides frameworks that are simple and scalable [1]. A novel NoC design paradigm is used for system-on-chip (SoC) architectures [2]. An SoC integrates a system's circuits on a single chip, with the goal of meeting the growing demand for parallel processing by enhancing the performance and productivity of chip design. As a result, chip multiprocessing (CMP) systems have undergone a revolution [3]. NoC connectivity has emerged as a significant on-chip communication option. The NoC design solves bus-based layout issues and delivers higher scalability, efficiency, and fault tolerance by using interconnected routing components [4]. The NoC architecture is made up of three parts: network interfaces (NI), the computing portion, and the router [5]. Routers are intelligent devices that analyse arriving data packets to identify the optimal path for the data from source to destination. The router is optimized for performance [6]. In a 2D mesh, each router has five directional channels: east, west, south, north, and the local channel [7]. NoC topologies are classified into two types: regular and irregular. Irregular topologies typically accomplish specific tasks by mixing two types of regular topology. Regular topologies are the standard kind; most NoCs employ mesh or torus because of their simplicity and uniform design [8] [9]. Chip Multi-Processors (CMPs) have been created to tackle design issues in order to satisfy rising processing needs. CMPs may have hundreds of interconnected Intellectual Property (IP) cores, processing units, and embedded memory blocks [10].
So far, several researchers have been motivated by the need for NoC-based designs as a means of solving complicated design issues. The authors of [11] offer a simulation technique for a large number of homogeneous and heterogeneous networks on chip integrated in a single FPGA. The system and FPGA were used to model and emulate the NoC. This approach enables the observation of NoC behaviour for a wide variety of traffic patterns. D. Borrione and colleagues [11] create a generic network on chip (GeNoC) as a functional model for interconnection networks. R. Sunkam Ramanujam [12] provided two distinct approaches to optimizing NoC throughput by load-balancing traffic across all links; the first is a new adaptive routing algorithm for mesh and torus topologies. For reconfigurable router design, M. Mathew and D. Mugilan [13] proposed adopting a heterogeneous router architecture. The suggested router can change the buffer length dynamically, and multiplexers are used to reduce energy usage. However, this requires a large area as well as additional hardware. The authors of [14] present a NoC-based router in which the virtual-channel buffering method plays a critical role in mesh network performance. A. Yu. Romanov et al. [15] recommend adopting two-dimensional circulant topologies for the on-chip network architecture. The authors of [17] proposed applying contemporary topologies to networks-on-chip, employing two-dimensional circulant topologies for NoC design and developing an optimal routing algorithm. Microstrip filter and antenna designs can be associated with FPGAs to enhance the performance of routing systems within specified frequency bands [18-20].

# 2. Network architecture

A network-on-chip is a multi-processor system on a chip (MPSoC) whose cores communicate through an interconnection fabric. Figure 1 depicts the simplified architecture of the NoC, which is set up as a 4x4 2-D mesh network. The system consists of a network of nodes that must transmit and receive data in order to accomplish their tasks. IP cores can be of numerous types, with the master acting as the CPU and the slave acting as the memory. Each node (R) comprises three parts: the processing element (PE), the network interface (NI), and the router, which acts as the network switch. Depending on the IP used, network nodes are classified as either master nodes or slave nodes.



Figure 1. The design proposed for Network on Chip

Figure 2 illustrates the master node, which consists of an NA master, an IP master, and a router; the router is the same kind for all master and slave nodes. The router has five directions: north, south, west, east, and local. The packet is disassembled and reassembled by the network adapter (NA), which is coupled to the local direction. The IP sends four signals to the NA: the address, which is 32 bits wide (A0-A31); write enable, which enables the writing process from the IP core to the NA/slave; write data, which carries the data to be written to the NA/slave; and read request, which the IP asserts when it needs to read data from the NA/slave. In return, the IP receives two signals: read return and read data.



Figure 2. Master node design

Figure 3 depicts the slave node, which consists of an NA slave, an IP slave, and a router. The IP sends three signals to the NA/slave: slave not ready, which indicates that the slave node is not ready to accept data or to have data read from it; read return, which signals that data is being returned to the NA/slave; and read data, which carries the data being read. The IP core in turn receives three signals from the NA/slave: the address signal, which specifies the location of the data to be accessed; write enable, which allows data to be written into the IP core buffer; and read request, through which the NA/slave requests data to be read.



Figure 3. Slave node design

The packet structure is shown in Figure 4. The master node is in charge of sending request packets in two modes, write and read, while the slave node is in charge of sending return packets in read-return mode via the read return signal. The method employed in this paper is the XY method, a deterministic packet routing algorithm. In this simple and efficient approach, bit 48 in all three packet types controls the direction of travel along the Y axis: depending on its value, the packet is sent north or south. The next two bits, 47-46, serve as a counter for the Y direction. Bit 45 gives the direction along the X axis: depending on its value, the packet is sent west or east. The following two bits, 44-43, serve as a counter for the X axis. Thus, as Figure 4 shows, the six most significant bits of all three packet types hold the routing information of the packet. Bits 42, 41, and 40 distinguish the packet type: a read request sets bit 42 and clears bits 41 and 40, a write sets bit 41 and clears the other two, and a read return sets bit 40 and clears the other two. The destination ID is given by bits 39-36, and the remaining bits are used for various purposes. They carry read or write data, as well as supplementary information such as the local address and the contents of the memory location; unused bits are filled with zeros to keep things simple and to preserve the specified packet length.
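The bit layout described above can be sketched as a pair of pack and unpack helpers. This is only an illustration under stated assumptions: the packet is taken to be 49 bits wide (bits 48 down to 0), the payload is taken to be all of bits 35-0, and the names `Packet`, `pack`, and `unpack` are hypothetical. The text does not say which value of bits 48 and 45 selects which direction, so the sketch stores them as raw bits.

```python
from dataclasses import dataclass

# One-hot packet-type codes for bits 42..40, as described in the text.
READ_REQUEST = 0b100  # bit 42 set
WRITE = 0b010         # bit 41 set
READ_RETURN = 0b001   # bit 40 set

PAYLOAD_BITS = 36     # assumption: bits 35..0 form a single payload field


@dataclass
class Packet:
    y_dir: int    # bit 48: Y direction (meaning of 0/1 is not given in the text)
    y_count: int  # bits 47-46: Y counter
    x_dir: int    # bit 45: X direction (meaning of 0/1 is not given in the text)
    x_count: int  # bits 44-43: X counter
    ptype: int    # bits 42-40: packet type
    dest_id: int  # bits 39-36: destination ID
    payload: int  # bits 35-0: data, address, or zero padding


def pack(p: Packet) -> int:
    """Concatenate the fields, most significant first, into one integer."""
    fields = [
        (p.y_dir, 1), (p.y_count, 2), (p.x_dir, 1), (p.x_count, 2),
        (p.ptype, 3), (p.dest_id, 4), (p.payload, PAYLOAD_BITS),
    ]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word: int) -> Packet:
    """Split a packed word back into its fields."""
    return Packet(
        y_dir=(word >> 48) & 0x1,
        y_count=(word >> 46) & 0x3,
        x_dir=(word >> 45) & 0x1,
        x_count=(word >> 43) & 0x3,
        ptype=(word >> 40) & 0x7,
        dest_id=(word >> 36) & 0xF,
        payload=word & ((1 << PAYLOAD_BITS) - 1),
    )
```

For example, `unpack(pack(Packet(1, 3, 0, 2, WRITE, 4, 0xABCD)))` returns an equal `Packet`, which is a quick check that the field boundaries line up with the bit numbers in the text.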



Figure 4. Packet structure
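One way a router could act on the six routing bits at each hop is sketched below. The sketch makes several assumptions that the text does not confirm: the two counters hold the number of hops remaining and are decremented at every hop, the Y counter is consumed before the X counter, and a direction bit of 1 means north (bit 48) or east (bit 45). The names `route_step` and `trace_ports` are hypothetical.

```python
from typing import List, Tuple

Y_DIR_BIT = 48      # Y direction bit
Y_COUNT_SHIFT = 46  # Y counter occupies bits 47-46
X_DIR_BIT = 45      # X direction bit
X_COUNT_SHIFT = 43  # X counter occupies bits 44-43
COUNT_MASK = 0b11


def route_step(word: int) -> Tuple[str, int]:
    """Return (output_port, updated_word) for one router hop."""
    y_count = (word >> Y_COUNT_SHIFT) & COUNT_MASK
    x_count = (word >> X_COUNT_SHIFT) & COUNT_MASK
    if y_count > 0:
        # Assumed encoding: 1 = north, 0 = south.
        port = "north" if (word >> Y_DIR_BIT) & 1 else "south"
        return port, word - (1 << Y_COUNT_SHIFT)  # decrement the Y counter
    if x_count > 0:
        # Assumed encoding: 1 = east, 0 = west.
        port = "east" if (word >> X_DIR_BIT) & 1 else "west"
        return port, word - (1 << X_COUNT_SHIFT)  # decrement the X counter
    # Both counters are zero: deliver to the local port.
    return "local", word


def trace_ports(word: int) -> List[str]:
    """List the output port chosen at each router until local delivery."""
    ports = []
    while True:
        port, word = route_step(word)
        ports.append(port)
        if port == "local":
            return ports


# Header with two hops along Y (direction bit 0) and one hop along X (bit 1).
header = (2 << Y_COUNT_SHIFT) | (1 << X_DIR_BIT) | (1 << X_COUNT_SHIFT)
print(trace_ports(header))  # ['south', 'south', 'east', 'local']
```

Because the route depends only on the header fields, the decision is deterministic and needs no routing table, which is the main attraction of this kind of dimension-ordered scheme.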

# 3. Results

The router contains buffers at both the input and the output, and we propose an extra stage of buffer arbitration before packets pass through the router, which decreases packet contention. Figure 5 depicts two packets, the first traveling from the north direction to the east direction and the second from the south direction to the east direction. The first packet is sent east, while the second packet is stored in a buffer. When more than one packet attempts to travel in the same direction, the router holds the next transmission until the previous packet has passed. It is critical to monitor the connection mechanism through the crossbar to ensure that it operates in accordance with the preset conditions and the grant issued by the arbiter. The crossbar connection process is driven by a round-robin arbiter.
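The round-robin arbitration mentioned above can be illustrated with a small behavioural model. It is a sketch rather than the authors' hardware: it assumes one arbiter per output port, an input order of north, south, east, west, local, and a hypothetical class name `RoundRobinArbiter`.

```python
from typing import List, Optional

# Assumed input order for the request vector (not specified in the text).
PORTS = ["north", "south", "east", "west", "local"]


class RoundRobinArbiter:
    """Grants one requester per call, rotating priority after each grant."""

    def __init__(self, n_inputs: int) -> None:
        self.n_inputs = n_inputs
        # Start so that input 0 has the highest priority on the first call.
        self.last_grant = n_inputs - 1

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the granted input, or None if nobody requests."""
        for offset in range(1, self.n_inputs + 1):
            idx = (self.last_grant + offset) % self.n_inputs
            if requests[idx]:
                self.last_grant = idx
                return idx
        return None


# Two packets, from north and from south, both want the east output.
east_arbiter = RoundRobinArbiter(len(PORTS))
first = east_arbiter.grant([True, True, False, False, False])
print(PORTS[first])   # north: this packet crosses the crossbar first
second = east_arbiter.grant([False, True, False, False, False])
print(PORTS[second])  # south: the buffered packet is granted next
```

Rotating the starting point of the search after every grant means that no input can be starved by a busier neighbour, which is why round-robin is a common choice for crossbar arbitration.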



Figure 5. Router packet from south to west

Figure 6 shows the network adapter (NA) master in a write instance. For write and read requests, the master NI packetizes the signals received from the master PE and sends them to the router; here, the one-bit write flag is activated.



Figure 6. NA Master

Figure 7 illustrates the network adapter slave in a read return situation. The activation of the one-bit read return flag is seen where the packets appear after assembly in the read return process. The source address in the read request packet serves as the destination of the read return packet.



Figure 7. NA Slave in Read return case

The IP master (Figure 8) represents the processor core, where data is processed before being written to the slaves in the write process. With each write operation, the counter counts a loop of 32. The IP slave serves as a storage device for the written data. When the IP slave receives a read request, this data is transmitted in the read return procedure, as Figure 9 illustrates. The data is saved in memory location zero.



Figure 8. IP Master



Figure 9. IP Slave

Figure 10 depicts the first network connection, between master0 and slave4. After being processed in master0, the data enters routerZero through the local direction. It flows from routerZero, in the north, to routerFour, in the south, where it is delivered through the local direction to slaveFour, which represents the memory.



Figure 10. Master0\_Slave4



Figure 11. Master1\_Slave12

Figure 11 shows how routerFive, routerNine, and routerThirteen in the network deliver data from master1 to slave12. Figure 12 depicts master2 sending data to slave15 via routers 6, 10, and 14 in the network. As seen in Figure 13, master3 in the network transmits data to slave7.
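The router sequences listed above can be reproduced with a short path trace. The sketch assumes that the 16 routers are numbered row by row (router n sits at row n // 4, column n % 4) and that packets travel along the Y axis first and the X axis second; both assumptions are inferred from the routers named in the text rather than stated by it. The function name `yx_path` is hypothetical.

```python
from typing import List

MESH_WIDTH = 4  # 4x4 mesh, routers assumed to be numbered row by row


def yx_path(src: int, dst: int, width: int = MESH_WIDTH) -> List[int]:
    """Routers visited from src to dst, moving along Y (rows) first, then X."""
    row, col = divmod(src, width)
    dst_row, dst_col = divmod(dst, width)
    path = [src]
    while row != dst_row:  # Y dimension first
        row += 1 if dst_row > row else -1
        path.append(row * width + col)
    while col != dst_col:  # then X dimension
        col += 1 if dst_col > col else -1
        path.append(row * width + col)
    return path


print(yx_path(0, 4))   # [0, 4]
print(yx_path(1, 12))  # [1, 5, 9, 13, 12]
print(yx_path(2, 15))  # [2, 6, 10, 14, 15]
print(yx_path(3, 7))   # [3, 7]
```

Under these assumptions the intermediate routers for master1 to slave12 are 5, 9, and 13, and for master2 to slave15 they are 6, 10, and 14, matching the routers named for Figures 11 and 12.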



Figure 12. Master2\_Slave15



Figure 13. Master3\_Slave7

# 4. Conclusion

We have successfully modelled, developed, and tested a simulated 4x4 2-D mesh Network-on-Chip in this article, while keeping the proposed network efficient and simple. Adding a second buffer to the network helps handle packet congestion. The results demonstrate the design's ability to deal with packet congestion and the speed of sending and receiving packets, achieved through the integration of the NoC on an FPGA system. Future work is to build a dedicated router capable of handling large amounts of data with minimum latency, and to increase the number of nodes inside the router.

## Acknowledgements

The authors would like to thank Al-Farahidi University, especially the College of Medical Techniques, for its support.

# **Declaration of competing interest**

The authors declare that they have no known financial or non-financial competing interests in any material discussed in this paper.

# **Funding information**

The authors declare that they received no funding from any financial organization to conduct this research.

## References

- [1] N. Abeyratne *et al.*, "Scaling towards kilo-core processors with asymmetric high-radix topologies," in 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), 2013.
- [2] M. Besta, S. M. Hassan, S. Yalamanchili, R. Ausavarungnirun, O. Mutlu, and T. Hoefler, "Slim NoC: A low-diameter on-chip network topology for high energy efficiency and scalability," *arXiv* [cs.AR], 2020.
- [3] D. Bhattacharya and N. K. Jha, "Analytical modeling of the SMART NoC," *IEEE trans. multi-scale comput. syst.*, vol. 3, no. 4, pp. 242–254, 2017.

- [4] L. Chen, R. Wang, and T. M. Pinkston, "Critical bubble scheme: An efficient implementation of globally aware network flow control," in 2011 IEEE International Parallel & Distributed Processing Symposium, 2011.
- [5] X. Chen and N. K. Jha, "Reducing wire and energy overheads of the SMART NoC using a setup request network," *IEEE Trans. Very Large Scale Integr. VLSI Syst.*, vol. 24, no. 10, pp. 3013–3026, 2016.
- [6] W. Dally and B. Towles, *Principles and Practices of Interconnection Networks*. Morgan Kaufmann Inc, 2003.
- [7] B. K. Daya *et al.*, "SCORPIO: A 36-core research chip demonstrating snoopy coherence on a scalable mesh NoC with in-network ordering," in 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA), 2014.
- [8] G. Dimitrakopoulos, A. Psarras, and I. Seitanidis, *Microarchitecture of Network-on-Chip Routers*. New York: Springer, 2015.
- [9] J. Duato, "A new theory of deadlock-free adaptive routing in wormhole networks," *IEEE Trans. Parallel Distrib. Syst.*, vol. 4, no. 12, pp. 1320–1331, 1993.
- [10] K. Duraisamy and P. P. Pande, "Enabling high-performance SMART NoC architectures using on-chip wireless links," *IEEE Trans. Very Large Scale Integr. VLSI Syst.*, vol. 25, no. 12, pp. 3495–3508, 2017.
- [11] P. T. Wolkotte, P. K. F. Holzenspies, and G. J. M. Smit, "Fast, accurate and detailed NoC simulations," in *First International Symposium on Networks-on-Chip (NOCS'07)*, 2007.
- [12] R. Ramanujam, *Throughput-driven Design of Networks-on-chip*. San Diego, 2011.
- [13] M. Mathew and D. Mugilan, "Reconfigurable router design for Network-on-Chip," in 2014 International Conference on Circuits, Power and Computing Technologies [ICCPCT-2014], 2014.
- [14] L. Nazir and R. N. Mir, "Evaluation of efficient elastic buffers for network on chip router," in 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), 2017.
- [15] A. Y. Romanov, E. V. Lezhnev, A. Y. Glukhikh, and A. A. Amerikanov, "Development of routing algorithms in networks-on-chip based on two-dimensional optimal circulant topologies," *Heliyon*, vol. 6, no. 1, p. e03183, 2020.
- [16] C.-H. Chao, K.-Y. Jheng, H.-Y. Wang, J.-C. Wu, and A.-Y. Wu, "Traffic- and thermal-aware run-time thermal management scheme for 3D NoC systems," in 2010 Fourth ACM/IEEE International Symposium on Networks-on-Chip, 2010.
- [17] K. Lahiri, A. Raghunathan, and S. Dey, "Efficient exploration of the SoC communication architecture design space," in *IEEE/ACM International Conference on Computer Aided Design. ICCAD 2000. IEEE/ACM Digest of Technical Papers (Cat. No.00CH37140)*, 2002.
- [18] Y. S. Mezaal and J. K. Ali, "Investigation of dual-mode microstrip bandpass filter based on SIR technique," *PLoS One*, vol. 11, no. 10, p. e0164916, 2016.
- [19] Y. S. Mezaal, H. T. Eyyuboglu, and J. K. Ali, "A novel design of two loosely coupled bandpass filters based on Hilbert-zz resonator with higher harmonic suppression," in 2013 Third International Conference on Advanced Computing and Communication Technologies (ACCT), 2013.
- [20] S. A. Shandal, Y. S. Mezaal, M. F. Mosleh, and M. A. Kadim, "Miniaturized wideband microstrip antenna for recent wireless applications," *Adv. electromagn.*, vol. 7, no. 5, pp. 7–13, 2018.