# Low Power Full Scan Architecture for UART Module

Abhinav S<sup>1</sup>, Kiran V<sup>2</sup>

<sup>1</sup>Student, Department of Electronics and Communication Engineering, R V College of Engineering, Mysore Road, RV Vidyaniketan, Post, Bengaluru, Karnataka, India-560059
<sup>2</sup>Associate Professor, Department of Electronics and Communication Engineering, R V College of Engineering, Mysore Road, RV Vidyaniketan, Post, Bengaluru, India-560059

Corresponding Author: Abhinav S

DOI: https://doi.org/10.52403/ijrr.20221106

#### ABSTRACT

Modern SoCs have complex designs made up of many macros, each responsible for a task of the target application. The demand for more raw computing power and the increase in integration density raise the effort required to verify and test a product. SoCs contain several communication modules whose failure could render the SoC worthless, so these modules must be testable. This work proposes a full scan architecture for a full-duplex UART module. It reports the power savings for manually inserted scan chains and for scan chains inserted automatically using system partition algorithms. Manually placed scan chains achieved a peak power reduction of 74.9 percent in comparison to automatic scan chains, while the automatically inserted scan chains occupy less area but show higher activity.

*Keywords:* Low power, full scan, UART, DFT, System partition algorithms

#### **INTRODUCTION**

Higher integration densities and faster chips became necessary from the dawn of the computer era through the early 21st century. The demand for verification and testing increased exponentially as circuit complexity grew with advances in integrated circuit manufacturing. Deep submicron fabrication brought lower yield and more faults. Testing complexity increased in order to support larger production volumes and longer field reliability of the chips. Obtaining 100% coverage is the goal of any chip testing procedure. Numerous measures are used to acquire a thorough understanding of a chip's testability. Once a chip is found to be testable, appropriate test logic is implemented. Scan chains are a subset of the DFT (Design for Test/Testability) architectures that allow for comprehensive coverage of a module or chip. The current SoC is a combination of several macros operating in various ways on a single chip. The high integration density provided by the fabrication sector makes this conceivable. By selecting from a number of IPs, each user can create a SoC based on the needs of a particular application. The challenge is determining how testable these IPs are within the SoC abstraction, how much DFT is necessary, and which architectures can be used. The full scan design is appropriate since the communication modules of a SoC may need high testability coverage, but its power consumption is also rather high. These DFT modules can be integrated into low power full scan or partial scan chain topologies, making the testing of modules energy efficient.

In the full scan architecture, each flip-flop in the netlist is converted to a scan flip-flop, and a control signal switches it between the functional operating mode and the test mode. The output of each scan flip-flop is connected to the scan input of the next available scan flip-flop. Where the output branches, testability measures guide the choice of a suitable scan flip-flop. In test mode, the contents of the scan flip-flops ripple out serially, as in a shift register. The total number of clock cycles needed for testing depends on the number of test vectors. Converting a flip-flop to a scan flip-flop incurs an area overhead. Eq. (1) gives this gate overhead and Eq. (2) gives the scan testing time.
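The scan behaviour described above can be illustrated with a small behavioural model. This is only a sketch under simplifying assumptions: the class names `ScanFlipFlop` and `ScanChain` are hypothetical, a single clock drives all flops, and the combinational logic is replaced by values passed directly to `capture`.

```python
from typing import List


class ScanFlipFlop:
    """Hypothetical mux-D scan flop: scan_enable selects scan_in over the functional D input."""

    def __init__(self) -> None:
        self.q = 0

    def clock(self, d: int, scan_in: int, scan_enable: int) -> None:
        # Test mode takes the scan input, functional mode takes D.
        self.q = scan_in if scan_enable else d


class ScanChain:
    """Scan flops stitched output-to-scan-input, like a serial shift register."""

    def __init__(self, length: int) -> None:
        self.flops = [ScanFlipFlop() for _ in range(length)]

    def shift(self, scan_in: int) -> int:
        """One test-mode clock edge; returns the bit leaving the last flop."""
        scan_out = self.flops[-1].q
        previous = [f.q for f in self.flops]  # sample all outputs before the edge
        for i, flop in enumerate(self.flops):
            source = scan_in if i == 0 else previous[i - 1]
            flop.clock(d=0, scan_in=source, scan_enable=1)
        return scan_out

    def capture(self, d_values: List[int]) -> None:
        """One functional-mode clock edge; each flop latches its D input."""
        for flop, d in zip(self.flops, d_values):
            flop.clock(d=d, scan_in=0, scan_enable=0)

    def state(self) -> List[int]:
        return [f.q for f in self.flops]


chain = ScanChain(3)
for bit in [1, 1, 0]:          # shift a test pattern in serially
    chain.shift(bit)
loaded = chain.state()
chain.capture([1, 0, 0])       # capture an assumed circuit response
unloaded = [chain.shift(0) for _ in range(3)]  # shift the response out
print(loaded, unloaded)
```

The first bit shifted in ends up in the last flop, and the response leaves the chain last flop first, which is why loading and unloading each take as many clock cycles as there are flops in the chain.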

$$\text{Gate Overhead} = \frac{4 \times NSFF}{Ngates + 10 \times NFF} \tag{1}$$

$$\text{Scan Testing Time} = (Ncomb + 2) \times NSFF + Ncomb + 4 \tag{2}$$

For a particular netlist, the gate overhead and the total number of clock cycles needed for testing are determined using Eq. (1) and Eq. (2). "NSFF" stands for "number of scan flip-flops," "NFF" for "number of flip-flops in the netlist," "Ngates" for "number of gates in the netlist," and "Ncomb" for "number of combinational test patterns" available for testing.
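As a worked illustration, the two equations can be evaluated directly. The sketch below is not taken from the original work; the function names and the sample values (10 flip-flops, all converted to scan, and 100 combinational patterns) are assumptions chosen only to show the arithmetic.

```python
def gate_overhead(nsff: int, ngates: int, nff: int) -> float:
    """Eq. (1): gate overhead as a fraction of the original design."""
    return 4 * nsff / (ngates + 10 * nff)


def scan_testing_time(ncomb: int, nsff: int) -> int:
    """Eq. (2): scan testing time in clock cycles."""
    return (ncomb + 2) * nsff + ncomb + 4


# Assumed example values, not measurements from the paper.
overhead = gate_overhead(nsff=10, ngates=173, nff=10)
cycles = scan_testing_time(ncomb=100, nsff=10)
print(f"gate overhead = {overhead:.3f}, scan testing time = {cycles} cycles")
```

For these assumed inputs the overhead is 40/273, about 0.147, and the test needs 1124 clock cycles.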

## LITERATURE SURVEY

The work in [1] uses system partitioning for scan insertion and decreases test data storage by excluding unnecessary segments, claiming a power reduction of 43.3 percent. [2] proposes a secure scan architecture that keeps the scan dump debuggable while enhancing scan design security with a skew-based lock and key. The suggested architecture combines the physical information into a lock-and-key system to create an imperceptible defense against the attacker. The circuit's security is strengthened and shielded from scan dump. A novel scan chain reordering technique based on care bit density is suggested in [3]. This technique aids in merging care bits at the beginning of scan chains. A decrease in scan cell switching has been realized, ensuring scan shift operations. Test power consumption is decreased by the suggested scan chain reordering technique. The efficiency of decompression, the number of parallel inputs at the LFSR, and scan chains are all discussed in [4]. By using simple exclusive-OR (XOR) gates, a state-skipping LFSR is developed, which reduces the number of test patterns and the delays through hardware.

Two effective scan flip-flop designs that use less power, area, and delay are presented in [5]. Cadence Virtuoso was used to develop the two scan flip-flop designs, a modified Transmission Gate based scan flip-flop and a Gate Diffusion Input based D flip-flop. A speed improvement was seen in both functional and test modes. The necessity of partial shift is described in [6], as well as how it reduces global switching activity and, thus, power consumption. It is shown that high heat dissipation increases clock skew, IR drop on clock buffers in the clock tree, and the chance of shift failures. That study presents a novel approach to the problem, one that decreases switching activity in clock tree paths by creating a clock-skew-aware scan chain regrouping.

According to [7], the scan chain for full scan can be exploited as a back door to gain access to crucial system components. As a result, each manufactured device has a physical unclonable function module embedded into the scan architecture for security. Since test patterns are not stored in memory, memory attacks are also impossible. According to [8], scan chains can be used to access data in the crypto-processors of embedded systems. That paper therefore suggests that control data and output data be encrypted during test mode. With a slight area penalty, the performance should be comparable to that of a straightforward test diagnostic.

A low power test pattern generator with test coverage of 98.81% and 97.35% for two alternative BIST architectures is suggested in [9]. Implemented in the SilTerra 0.13 µm process, it uses 26.7 nW of power. Weighted clocks are used to allocate weights to the particular scan chains according to the maximum length sequence. The instability caused by many large scan flip-flops triggering simultaneously is discussed in [10], which suggests a power displacement strategy that lowers toggling activity and triggering activity. The work claims a power decrease of approximately 35%.

## **METHODOLOGY**

This work compares the power savings of manual scan chain insertion with those of scan chain allocation based on system partitioning. The type of UART module designed determines how many scan chains are necessary. Here, the UART module supports full-duplex communication at the highest baud rate. The Cadence Genus tool is used to synthesize the UART module. Unlike power, process technology is not a major constraint. Depending on the power requirements, a suitable process design kit (PDK) can be used. To enable scan chain insertion, the PDK needs scan flip-flop standard cells. Fig. 1 depicts the methodology process flowchart.



Flip-flops are identified using the synthesized register transfer level (RTL) code and the generated netlist. The tool applies system partition techniques to find the paths for scan chain insertion. The paths for scan chain insertion are also defined manually for comparison. Tcl scripts direct the tool to apply the system partition techniques, enabling a netlist synthesis that suits physical design. This lets the tool create a netlist while preventing DRC violations, making the design simpler to place with little congestion. This is merely a simple step to ease the physical design flow, and it may or may not lead to much faster design convergence there. The scan chains are inserted and the design is resynthesized for optimal area and performance. At every step of the synthesis process, the netlist is checked for DFT violations. The most common DFT violation encountered is caused by the clock gating commands used for low power. The area overhead and the increase in power are both measured.
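To make the effect of the scan chain count concrete, the sketch below distributes a list of flip-flops over a chosen number of chains in round-robin order. It is a deliberately simple stand-in and not the system partition algorithm used by the tool; the function names and the flip-flop names `ff0`, `ff1`, ... are hypothetical.

```python
from typing import Dict, List


def allocate_scan_chains(flops: List[str], num_chains: int) -> Dict[int, List[str]]:
    """Assign flip-flops to chains round-robin so that chain lengths differ by at most one."""
    if num_chains < 1:
        raise ValueError("at least one scan chain is required")
    chains: Dict[int, List[str]] = {i: [] for i in range(num_chains)}
    for index, name in enumerate(flops):
        chains[index % num_chains].append(name)
    return chains


def shift_cycles(chains: Dict[int, List[str]]) -> int:
    """Clock cycles to load one pattern: set by the longest chain, since chains shift in parallel."""
    return max(len(chain) for chain in chains.values())


flops = [f"ff{i}" for i in range(10)]  # hypothetical flip-flop names
for count in (1, 2, 3, 4):
    chains = allocate_scan_chains(flops, count)
    print(count, "chain(s):", shift_cycles(chains), "shift cycles per pattern")
```

More chains shorten the shift time per pattern, but each chain needs its own scan-in and scan-out access, so the chain count is a trade-off rather than a free parameter.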

#### **RESULTS**

The Cadence Genus tool is used to synthesize the RTL designs and insert scan chains into them. Figures 2 and 3 show the estimated gate overhead and scan test time. For a given number of gates, the gate overhead is observed to increase as the number of flops increases. The number of flip-flops can be anything between one and half of the number of gates. Although it depends on the type of design and netlist, there are often fewer sequential elements. The scan test time is linear in both the number of combinational test patterns and the number of scan flip-flops. In these estimates, the number of scan flip-flops varies from a minimal value to half the estimated gates in the netlist, while the number of combinational test patterns is fixed.
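The trends described above can be reproduced numerically from Eq. (1) and Eq. (2). The sketch below is an illustration under stated assumptions, not the script behind Figures 2 and 3: it assumes full scan (every flip-flop is a scan flip-flop, so NSFF equals NFF), a gate count of 173, and an arbitrary fixed pattern count of 50.

```python
def gate_overhead(nsff: int, ngates: int, nff: int) -> float:
    """Eq. (1): gate overhead as a fraction."""
    return 4 * nsff / (ngates + 10 * nff)


def scan_testing_time(ncomb: int, nsff: int) -> int:
    """Eq. (2): scan testing time in clock cycles."""
    return (ncomb + 2) * nsff + ncomb + 4


NGATES = 173  # assumed gate count for the sweep
NCOMB = 50    # assumed, fixed number of combinational test patterns

# Sweep the flop count from one up to half the gate count, with NSFF = NFF.
rows = [
    (n, gate_overhead(n, NGATES, n), scan_testing_time(NCOMB, n))
    for n in range(1, NGATES // 2 + 1)
]

for n, overhead, cycles in rows[::17]:
    print(f"flops={n:3d}  overhead={overhead:.3f}  test_time={cycles} cycles")
```

Under these assumptions the overhead rises with every added flop but stays below 4/10, because the flop count appears in both the numerator and the denominator of Eq. (1), while the test time grows by a constant Ncomb + 2 cycles per added scan flip-flop.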

The ability to test practically all of the nets in the obtained netlist is a benefit of full scan architecture, as was previously mentioned. This comes at the cost of area, power, and possibly longer design convergence times while the scan chains are stitched optimally into the netlist. In today's SoCs, the majority of flip-flops are converted to scan flops, which causes timing convergence challenges and may result in a considerable rise in power consumption. Low power approaches are then needed to reduce switching activity. The result is a higher dimensional optimization problem with many variables. Full scan architecture is therefore best suited to high sensitivity modules. Compared to manual scan chain insertion, the power reduction seen in system partition-based scan chain insertion may or may not be smaller.



**Table I Performance of Scan Chain Insertion** 

| Parameters            | Without scan chains | Manual scan chain insertion | Manual scan chain insertion | System partition scan insertion |
|-----------------------|---------------------|-----------------------------|-----------------------------|---------------------------------|
| Gates                 | 173                 | 173                         | 173                         | 173                             |
| Area (µm<sup>2</sup>) | 839                 | 857                         | 858                         | 856                             |
| Power (µW)            | 13.34               | 16.424                      | 21.5                        | 21.62                           |
| Number of scan chains | 0                   | 3                           | 4                           | 2                               |

Table I shows the results for the scan chains introduced into the UART module. The General-purpose Design Kit (GPDK) 45nm process technology is employed. Manual insertion with four scan chains performed similarly to scan chain insertion based on system partition algorithms, whereas manual insertion with three scan chains led to lower activity and consequently lower power consumption. CAD tools are highly sophisticated and can converge the design with barely perceptible alterations. While the area overhead between a netlist with and without DFT is apparent, the area variation across different numbers of scan chains is negligible. The power consumption of scan chains inserted manually and those generated by tool convergence differs noticeably. A bigger system like a SoC may experience power performance issues as a result. The tradeoff, however, lies in the number of pins required for testing, the cost of the test equipment, and the area. Additionally, developing manual scan chain insertion for many modules with various speed, area, and power constraints takes time.
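The relative overheads implied by Table I can be computed with a few lines of arithmetic. The sketch below only restates the area and power values from Table I; the dictionary keys and the helper name `relative_increase` are hypothetical labels introduced for illustration.

```python
# Area (µm^2) and power (µW) values copied from Table I.
TABLE_I = {
    "without_scan":     {"area": 839.0, "power": 13.34,  "chains": 0},
    "manual_3_chains":  {"area": 857.0, "power": 16.424, "chains": 3},
    "manual_4_chains":  {"area": 858.0, "power": 21.5,   "chains": 4},
    "system_partition": {"area": 856.0, "power": 21.62,  "chains": 2},
}


def relative_increase(value: float, baseline: float) -> float:
    """Fractional increase of value over baseline."""
    return (value - baseline) / baseline


baseline = TABLE_I["without_scan"]
overheads = {
    name: {
        "area": relative_increase(row["area"], baseline["area"]),
        "power": relative_increase(row["power"], baseline["power"]),
    }
    for name, row in TABLE_I.items()
    if name != "without_scan"
}

for name, result in overheads.items():
    print(f"{name:17s} area +{result['area']:.2%}  power +{result['power']:.2%}")
```

The area overhead stays within a few percent of the design without scan chains for every configuration, while the power overhead differs much more between the three-chain manual insertion and the other two configurations.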

## CONCLUSION

Testing of high density integrated circuits is becoming increasingly necessary. DFT must deliver high coverage and low area overhead along with good power and speed performance. The full scan chain architecture is a DFT architecture that trades off area, speed, and power for high coverage. Full scan chains are practical only on smaller, very sensitive modules, such as communication modules. Defects in these modules may render a good chip useless. Other, equally potent DFT structures that enable design tests with high confidence include LBIST and MBIST. The quantity of DFT placed in the netlist can be decreased at the expense of power and speed performance by using tools with very high convergence algorithms to insert and stitch scan chains. Tools must be supplied with a wider range of optimization strategies in order to converge on a high-quality design with the fewest trade-offs.

## Conflict of Interest: None

### REFERENCES

1. H. Kim, H. Oh, S. Lee and S. Kang, "Low Power Scan Chain Architecture Based on Circuit Topology," 2018 International SoC Design Conference (ISOCC), 2018.
2. H. Woo, S. Jang and S. Kang, "A Secure Scan Architecture Protecting Scan Test and Scan Dump Using Skew-Based Lock and Key," in IEEE Access, 2021.
3. Kyunghwan Cho, Jihye Kim, Hyunggoy Oh, Sangjun Lee, and Sungho Kang, "A New Scan Chain Reordering Method for Low Power Consumption based on Care Bit Density," in IEEE International SoC Design Conference (ISOCC), 2019.
4. D Manasa Manikya, Marala Jagruthi, Rana Anjum, Ashok Kumar K, "Design of Test Compression for Multiple Scan Chains Circuits," International Conference on System, Computation, Automation and Networking (ICSCAN), 2021.
5. Nagesh B, Nikhil Chandra B S, "Design of Efficient Scan Flip-Flop," 6th International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT), 2021.
6. Yucong Zhang, Xiaoqing Wen, Stefan Holst, Kohei Miyase, Seiji Kajihara, Hans-Joachim Wunderlich, and Jun Qian, "Clock-Skew-Aware Scan Chain Grouping for Mitigating Shift Timing Failures in Low-Power Scan Testing," IEEE 27th Asian Test Symposium, 2018.
7. K.-J. Lee, C.-A. Liu and C.-C. Wu, "A Dynamic-Key Based Secure Scan Architecture for Manufacturing and In-Field IC Testing," in IEEE Transactions on Emerging Topics in Computing, vol. 10, no. 1, pp. 373-385, Jan.-March 2022.
8. Yucong Zhang, Xiaoqing Wen, Stefan Holst, Kohei Miyase, Seiji Kajihara, Hans-Joachim Wunderlich, and Jun Qian, "Clock-Skew-Aware Scan Chain Grouping for Mitigating Shift Timing Failures in Low-Power Scan Testing," IEEE 27th Asian Test Symposium, 2018.
9. V. Shivakumar, C. Senthilpari and Z. Yusoff, "A Low-Power and Area-Efficient Design of a Weighted Pseudorandom Test-Pattern Generator for a Test-Per-Scan Built-in Self-Test Architecture," in IEEE Access, vol. 9, pp. 29366-29379, 2021.
10. J. C. Rau and J.-X. Wang, "A Scan-Based Lower-Power Testing Architecture for Modern Circuits," 2021 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), 2021.

How to cite this article: Abhinav S, Kiran V. Low power full scan architecture for UART module. *International Journal of Research and Review*. 2022; 9(11): 40-45. DOI: *https://doi.org/10.52403/ijrr.20221106* 
