# Models and Algorithms of Soft Error Inducing Mechanisms in ICs

by

## Pelopidas Tsoumanis

Submitted to the Department of Electrical and Computer Engineering in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Electrical and Computer Engineering

at the

UNIVERSITY OF THESSALY

February 2021

© Pelopidas Tsoumanis 2021. All rights reserved.



Author: Pelopidas Tsoumanis, Department of Electrical and Computer Engineering, February 24, 2021

Certified by: George Stamoulis, Professor, Thesis Supervisor

Accepted by: Christos D. Antonopoulos, Associate Professor, Chairman, Department Committee on Graduate Theses

# Models and Algorithms of Soft Error Inducing Mechanisms in ICs

by

### Pelopidas Tsoumanis

Submitted to the Department of Electrical and Computer Engineering on February 24, 2021, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical and Computer Engineering

#### Abstract

The reliability of Integrated Circuits (ICs) has always been one of the primary concerns in the VLSI field, and even more so today, when the continuous technology shrinking that follows Moore's Law renders them more susceptible to various factors. Recently, the reliability challenge of radiation-induced Soft Errors has drawn considerable attention from researchers. A particle of sufficient energy that strikes a transistor may disturb the equilibrium between electrons and holes within the device and change the logic state of the gate output, typically for a few picoseconds. This temporary phenomenon, which emerges as a glitch at the gate output and is called a Single Event Transient (SET), may affect the proper operation of the circuit as it propagates and may eventually be captured by a storage element. This is called a Soft Error and, although it is not permanent, it may cause unexpected circuit behavior. Therefore, it is important for the industry to know how susceptible a circuit is to such errors, especially in critical systems such as medical, avionics, and military applications.

In this dissertation, we present an integrated framework for modeling radiation-induced Soft Errors in the combinational logic of ICs, providing an evaluation of the Soft Error Rate (SER), a widely used metric for the susceptibility of ICs to such hazards. The proposed methodology, which is based on Monte Carlo simulations, takes into consideration the physical layout of a circuit to deal with Single Event Multiple Transients (SEMTs), which have become more prevalent with continuous technology downscaling. Two variations of the main SER estimation algorithm, for identifying the most susceptible areas of a circuit and its most vulnerable gates, are also presented, allowing for slight modifications of the circuit design, the placement strategy in the early design stages, or both, to mitigate SER. The SER evaluation results are demonstrated by a variety of simulations performed on the ISCAS '89 benchmark suite for both 45nm and 15nm technologies, and their verification with the HSPICE simulation tool indicates an acceptable deviation for the small-scale circuits. Finally, the behavior of the state-of-the-art Fully-Depleted Silicon-On-Insulator (FDSOI) process technology under ionizing particle strikes is examined and compared with the traditional Bulk technology through TCAD simulations.

Thesis Supervisor: George Stamoulis

Title: Professor

## Models and Simulation Algorithms of Soft Error Inducing Mechanisms in Integrated Circuits

by

## Pelopidas Tsoumanis

Submitted to the Department of Electrical and Computer Engineering on February 24, 2021, in partial fulfillment of the requirements for the Doctoral Degree in Electrical and Computer Engineering

## Greek Abstract

The reliability of Integrated Circuits (ICs) has always been one of the most frequent concerns in the VLSI field, all the more so today, when the continuous technology shrinking that follows Moore's Law renders them more vulnerable to various factors. The challenge concerning reliability and the Soft Errors induced by ionizing radiation is steadily gaining ground and has recently drawn the attention of many researchers. A particle of sufficient energy that collides with a transistor may disturb the equilibrium between electrons and holes within the device, resulting in a change of the logic state of the gate output that usually lasts a few picoseconds. This temporary phenomenon, which appears as an unwanted pulse at the output and is called a Single Event Transient (SET), may affect the correct operation of the circuit as it propagates and is eventually stored in a memory element. This is called a Soft Error and, although it is not permanent, it may cause unexpected behavior in the circuit. It is therefore important for the industry to know the vulnerability of circuits to such hazards, especially when they concern critical systems such as medical, avionics, and military ones.

In this dissertation we present an integrated framework for modeling Soft Errors caused by ionizing radiation in the combinational logic of ICs, providing an estimate of the Soft Error Rate (SER), a widely used metric of the vulnerability of ICs to such hazards. The proposed methodology, which is based on Monte Carlo simulations, takes into account the physical layout of a circuit in order to model Single Event Multiple Transients (SEMTs), which are becoming increasingly frequent as technology shrinks. In addition, two variations of the main SER estimation algorithm are presented, for identifying the sensitive regions and the most sensitive gates of a circuit, allowing small modifications either of the circuit design or even of its placement strategies in the early design stages in order to reduce the SER. The SER estimation results are demonstrated through a variety of simulations performed on the ISCAS '89 circuits for the 45nm and 15nm technologies, while verification with the HSPICE simulation tool indicates an acceptable deviation for the small-scale circuits. Finally, the behavior of the modern Fully-Depleted Silicon-On-Insulator (FD-SOI) IC fabrication technology against particle strikes is examined and compared with the conventional Bulk technology through TCAD simulations.

Thesis Supervisor: George Stamoulis

Title: Professor

## Acknowledgments

As the curtain falls on this long, painstaking, but at the same time exciting journey of my doctoral studies, I would like to express my sincere appreciation and deepest gratitude to the people without whose support and help it would have been impossible for me to bring this dissertation to fruition.

First, I would like to thank my advisor, Professor George Stamoulis, for supervising my research progress, as well as for his invaluable guidance and constant support throughout my doctoral studies, contributing to my academic growth with his technical expertise. I would also like to thank my thesis committee members, Professor Nestor Evmorfopoulos and Professor Fotis Plessas, for helping me improve and refine this work with their insightful feedback. In addition, I would like to express my appreciation to my thesis examination committee members, Professor Michael Dossis, Professor Georgios Dimitriou, Professor Antonios Dadaliaris and Professor Spyridon Nikolaidis, for examining and accepting this dissertation, as well as for providing useful technical feedback. I would like to acknowledge all my colleagues and friends in the Electronics Lab of the Department for the harmonious and constructive cooperation and, of course, the technical and secretarial support staff of the Department for facilitating my work throughout my studies.

Last but not least, I would like to thank my family from the bottom of my heart for their unreserved and wholehearted support. My deepest gratitude goes to my parents for their unconditional support and encouragement, and for empathizing with my concerns. Without their emotional and financial generosity, this beautiful adventure would not have been crowned with success. Also, special thanks to all my friends for the priceless and unforgettable moments that we have shared all these years, which helped me rest my mind outside of my research.



This doctoral thesis has been examined by a Committee of the Department of Electrical and Computer Engineering as follows:

- Professor George Stamoulis, Chairman, Thesis Supervisor, Professor of Electrical and Computer Engineering, University of Thessaly
- Professor Nestor Evmorfopoulos
- Professor Fotis Plessas
- Professor Michael Dossis
- Professor Georgios Dimitriou
- Professor Antonios Dadaliaris
- Professor Spyridon Nikolaidis



# Contents

- Abstract
- Greek Abstract
- List of Figures
- List of Tables
- List of Abbreviations
- 1 Introduction
  - 1.1 Motivation
  - 1.2 Author Contribution
    - 1.2.1 Contribution
    - 1.2.2 Publications
  - 1.3 Outline
- 2 Background
  - 2.1 Main Causes of Soft Errors
    - 2.1.1 Alpha particles
    - 2.1.2 Cosmic radiation
    - 2.1.3 Other sources
  - 2.2 Soft Errors on Silicon
    - 2.2.1 Radiation interaction with silicon
    - 2.2.2 Linear Energy Transfer
    - 2.2.3 Modeling of the ionizing particle strike
  - 2.3 Soft Error Rate Mitigation
- 3 Related Work
- 4 Soft Error Rate Estimation Framework
  - 4.1 Masking Mechanisms
    - 4.1.1 Logical masking
    - 4.1.2 Electrical masking
    - 4.1.3 Timing masking
  - 4.2 Single Event Transients
    - 4.2.1 Masking mechanisms modeling
    - 4.2.2 Reconvergent transient pulses
    - 4.2.3 Timing issues
    - 4.2.4 Failure probability calculation
    - 4.2.5 Gate sensitivity evaluation
  - 4.3 Single Event Multiple Transients
    - 4.3.1 Sensitive regions
    - 4.3.2 Multiple site identification
  - 4.4 Overall Soft Error Rate Estimation Flow
- 5 Experimental Results
  - 5.1 Soft Error Rate Verification
  - 5.2 Experimental Results
    - 5.2.1 Experimental setup
    - 5.2.2 Impact of the electrical and timing masking modeling on the SER
    - 5.2.3 Impact of the masking mechanisms on SET propagation
    - 5.2.4 Evaluation of the effect of SET and SEMT consideration on the SER
    - 5.2.5 Circuit site and gate vulnerability evaluation
    - 5.2.6 Temperature dependence of the SER
    - 5.2.7 Comparison among similar SER estimation approaches
  - 5.3 Speed-up of SER Estimation
- 6 Soft Errors in FDSOI Technology
  - 6.1 About Silicon On Insulator Technology
  - 6.2 Heavy Ion Strike TCAD Characterization
    - 6.2.1 About TCAD simulations
    - 6.2.2 Simulation setup
    - 6.2.3 NMOS and PMOS device simulations
    - 6.2.4 Mixed-mode CMOS Inverter simulation
- 7 Conclusions and Future Work
  - 7.1 Conclusion
  - 7.2 Future Work
- A Overall SER Results of ISCAS '89 Benchmarks on 45nm and 15nm Technologies
- Bibliography



# List of Figures

- Figure 1-1: CMOS technology generation trend of (a) critical charge $(Q_{crit})$ and (b) Soft Error Rate (SER) of SRAM cells, latches, and combinational logic.
- Figure 2-1: Representation of various particle cascades as high-energy cosmic rays interact with earth's atmosphere.
- Figure 2-2: Charge generation and collection due to a particle strike on a reverse-biased p-n junction. (a) Generation of electron-hole pairs along the ion traversal, (b) quick drift charge collection and extension of the depletion region towards the substrate forming a funnel, (c) slow diffusion charge collection, and (d) the generated current pulse on the transistor node throughout this process.
- Figure 2-3: Energy loss for alpha particles of 5.49 MeV moving through air.
- Figure 2-4: Generated double exponential current pulse.
- Figure 2-5: Modeling of the radiation particle strike with a current source when the strike occurs on (a) nMOS transistor and (b) pMOS transistor of an Inverter.
- Figure 2-6: Particle striking the off pMOS of a CMOS Inverter.
- Figure 4-1: A SET resulted from an ionizing particle strike on a NAND2 gate is subjected to the three masking mechanisms, i.e., (a) logical, (b) electrical, and (c) timing, as it propagates through the circuit.
- Figure 4-2: Output pulses for (a) same direction, (b) different direction, and (c) non-overlapping input reconvergent pulses.
- Figure 4-3: Distributed RC interconnection tree.
- Figure 4-4: Node density and SEMTs increase as technology downscales.
- Figure 4-5: Illustration of the sensitive regions upon the physical layout and their dependency on the logic input values of two basic logic cells, (a) Inverter and (b) NAND2.
- Figure 4-6: Juxtaposition of (a) netlist-based and (b) layout-based identification of adjacent gates revealing the inaccuracy of the former approach.
- Figure 4-7: Snippet of (a) COMPONENTS and (b) NETS block statements of a DEF file.
- Figure 4-8: Mapping of the diffusion areas depending on the standard cell orientation.
- Figure 4-9: Overall flow of the SER estimation framework.
- Figure 5-1: HSPICE simulation flow for verification.
- Figure 5-2: Impact of the masking effects on the propagation of radiation-induced SETs through some benchmark circuits for (a) 45nm and (b) 15nm technologies, and (c) the corresponding failure probabilities.
- Figure 5-3: Failure probabilities of the benchmarks considering SETs and SEMTs for both 45nm and 15nm technologies.
- Figure 5-4: Percentage of injected faults inducing SETs or SEMTs and being latched by a FF, and the corresponding failure probability of some benchmarks for (a) 45nm and (b) 15nm technology.
- Figure 5-5: Illustration of the SER hotspot circuit regions of $s15850$ and $s35932$ benchmarks for both 45nm and 15nm technologies.
- Figure 5-6: Distribution of the gates depending on their sensitivity (GLP values) for some benchmarks.
- Figure 5-7: Impact of operating temperature on the failure probabilities of some benchmarks.
- Figure 5-8: Failure probability convergence.
- Figure 6-1: Structure of a typical bulk nMOSFET device.
- Figure 6-2: Structure of typical (a) PDSOI and (b) FDSOI nMOSFET devices.
- Figure 6-3: 3D illustration of a heavy ion strike on the drain of nMOSFET and the induced charge generation along its track.
- Figure 6-4: SET drain current for various LETs and both Bulk and FDSOI technologies when (a) nMOSFET and (b) pMOSFET are affected.
- Figure 6-5: nMOS drain current pulses for various LETs of (a) Bulk and (b) FDSOI technologies.
- Figure 6-6: SET drain current for three different strike angles and both Bulk and FDSOI technologies when (a) nMOSFET and (b) pMOSFET are affected.
- Figure 6-7: Transient plot of a CMOS Inverter when two heavy ions strike the nMOS and pMOS transistor at different time moments.



# List of Tables

- Table 4.1: Propagation delays and output pulse widths for a NAND2 gate and for both transitions considering 100ps, 300ps, and 500ps input pulse widths.
- Table 4.2: Average affected area.
- Table 5.1: Number of faults injected for each benchmark.
- Table 5.2: Verification results of failure probability.
- Table 5.3: Comparison of the proposed electrical and timing masking models with HSPICE on various SET pulse propagation paths.
- Table 5.4: Failure probabilities considering the closed-form approach and the NLDM-based approach for electrical masking and for both 45nm and 15nm technologies on a subset of ISCAS '89 benchmarks.
- Table 5.5: Failure probabilities considering LE, NLDM and RC Interconnection approaches for both 45nm and 15nm technologies on a subset of ISCAS '89 benchmarks.
- Table 5.6: Qualitative comparison of the proposed tool with other state-of-the-art SER estimation approaches.
- Table 5.7: Runtimes and speed-up of SER estimation execution on some ISCAS '89 benchmarks.
- Table 6.1: Physical parameters of nMOS devices.
- Table A.1: SER evaluation results, obtained from the proposed tool, for the ISCAS '89 benchmarks and 45nm technology. The number of nodes, primary inputs, gates and D-FFs indicate the benchmark complexity, Fail. Rate and FIT denote the SER as failure probability and in terms of FIT, respectively, whereas Ex. Time is the average execution time.
- Table A.2: SER evaluation results, obtained from the proposed tool, for the ISCAS '89 benchmarks and 15nm technology. The number of nodes, primary inputs, gates and D-FFs indicate the benchmark complexity, Fail. Rate and FIT denote the SER as failure probability and in terms of FIT, respectively, whereas Ex. Time is the average execution time.

## List of Abbreviations

| Abbreviation | Meaning                                           |
|--------------|---------------------------------------------------|
| BPSG         | BoroPhosphoSilicate Glass                         |
| CMOS         | Complementary Metal-Oxide-Semiconductor           |
| DEF          | Design Exchange Format                            |
| EDA          | Electronic Design Automation                      |
| FDSOI        | Fully Depleted Silicon On Insulator               |
| FF           | Flip-Flop                                         |
| FinFET       | Fin Field-Effect Transistor                       |
| FIT          | Failures In Time                                  |
| GDSII        | Graphic Design System II                          |
| GLP          | Glitch Latching Probability                       |
| IC           | Integrated Circuit                                |
| LEF          | Library Exchange Format                           |
| LET          | Linear Energy Transfer                            |
| LUT          | Look-Up Table                                     |
| MSET         | Multiple Single Event Transient                   |
| MOSFET       | Metal-Oxide-Semiconductor Field-Effect Transistor |
| MTTF         | Mean Time To Failure                              |
| NLDM         | Non-Linear Delay Model                            |
| SEE          | Single Event Effect                               |
| SEMT         | Single Event Multiple Transient                   |
| SER          | Soft Error Rate                                   |
| SET          | Single Event Transient                            |
| SEU          | Single Event Upset                                |
| SOI          | Silicon On Insulator                              |
| TCAD         | Technology Computer-Aided Design                  |
| VLSI         | Very Large-Scale Integration                      |

# Chapter 1

## Introduction

## 1.1 Motivation

The motivation behind this dissertation derives primarily from the constant reliability concerns in the VLSI field regarding radiation-induced Soft Errors. Technology shrinking entails higher operating frequencies together with lower supply voltages, cell capacitances, and critical charges (the critical charge, $Q_{crit}$, is the minimum charge required to induce a node upset), all of which render modern chips more susceptible to such hazards. As a result, the need for a comprehensive analysis of soft errors and their impact on IC operation becomes even more imperative, and simulation tools contribute significantly to the development of robust and error-tolerant chips. It has been shown that the atmospheric neutron Soft Error Rate (SER) in SRAM cells increases as device feature sizes decrease [1], whereas an increasing susceptibility of sequential and static combinational devices to alpha particles and a growth in multi-bit upset probabilities have been observed with technology scaling [2], despite the fact that the reduction in charge collection area has been reported to compensate for the critical charge scaling trend. Also, technology trends are expected to result in a considerable increase in the soft error rates of combinational logic compared to sequential logic [3, 4]. The significant reduction in the critical charge of logic circuits as device feature size decreases has a massive impact on the SER of combinational logic, as shown in Figure 1-1. Several error correction schemes that have emerged are able to protect memories from soft errors, although it has been reported that there is a saturation in soft error mitigation [5]. Thus, an accurate and effective framework for evaluating circuit vulnerability in combinational logic, through the integration of EDA tools that adequately simulate the radiation-induced soft error mechanisms, may be further exploited in the design of error-resistant deep submicron and nanoscale systems through the development and employment of radiation-hardening techniques. At the same time, the emergence and evolution of unconventional silicon semiconductor technologies and devices (e.g., PDSOI or FDSOI wafers, FinFETs, etc.), which are utilized by state-of-the-art microelectronic ICs, may change both the reliability evaluation process, which remains a challenge and an open field of research, and its outcomes.

Figure 1-1: CMOS technology generation trend of (a) critical charge $(Q_{crit})$ and (b) Soft Error Rate (SER) of SRAM cells, latches, and combinational logic [3].

## 1.2 Author Contribution

#### 1.2.1 Contribution

In this dissertation, we deal with radiation-induced soft errors in the combinational logic of ICs. An integrated algorithm for SER estimation that accounts for multiple transient faults is developed. The implemented tool, which is based on Monte Carlo simulations, models and incorporates the fundamental components of SER estimation in combinational logic, namely the three masking mechanisms (logical, electrical, and timing). We exploit the results of a Single Event Transient (SET) pulse characterization process and the detailed layout information of the designs to achieve an accurate reliability assessment. Note that the terms SET and transient fault are equivalent, and thus they are used interchangeably in this dissertation.
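To make the idea of a Monte Carlo SER estimate concrete, the following is a minimal sketch, not the tool developed in this dissertation. It injects random SETs and counts the fraction that survives logical, electrical, and timing masking. The `Gate` fields, the uniform pulse-width range, the clock period, and the latching window are all hypothetical values chosen for illustration, and the sketch ignores layout information and multiple transients.

```python
import random
from dataclasses import dataclass


@dataclass
class Gate:
    """Hypothetical per-gate parameters; a real tool would derive them
    from the netlist, the cell library, and the layout."""
    name: str
    p_sensitized: float    # chance that side inputs let the SET through (logical masking)
    attenuation_ps: float  # pulse width lost on the way to a flip-flop (electrical masking)


def strike_is_latched(gate, width_ps, clock_ps, window_ps, rng):
    """Return True if one simulated SET survives all three masking checks."""
    # Logical masking: the pulse is blocked unless a path is sensitized.
    if rng.random() >= gate.p_sensitized:
        return False
    # Electrical masking: the pulse shrinks and may vanish completely.
    width_ps -= gate.attenuation_ps
    if width_ps <= 0.0:
        return False
    # Timing masking: the pulse must cover the whole latching window,
    # assumed here to be the last `window_ps` of the clock period.
    arrival = rng.uniform(0.0, clock_ps)
    window_start = clock_ps - window_ps
    return arrival <= window_start and arrival + width_ps >= clock_ps


def estimate_failure_probability(gates, trials, seed=0,
                                 width_range_ps=(100.0, 500.0),
                                 clock_ps=1000.0, window_ps=50.0):
    """Monte Carlo estimate: fraction of injected SETs that get latched."""
    rng = random.Random(seed)
    latched = 0
    for _ in range(trials):
        gate = rng.choice(gates)              # random strike location
        width = rng.uniform(*width_range_ps)  # random initial SET width
        if strike_is_latched(gate, width, clock_ps, window_ps, rng):
            latched += 1
    return latched / trials


if __name__ == "__main__":
    # Two made-up gates with different masking characteristics.
    demo = [Gate("g1", 0.5, 20.0), Gate("g2", 0.9, 150.0)]
    print(estimate_failure_probability(demo, trials=10_000))
```

The estimate converges as the number of trials grows, and fixing the seed makes a run reproducible. In a layout-aware flow, the uniform choice of a struck gate would presumably be replaced by sampling a strike location on the layout.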

A significant part of this research consists of the SER results for various benchmarks. In particular, the whole suite of ISCAS '89 benchmark circuits is used to demonstrate their susceptibility to ionizing radiation. We also perform a topological analysis of the circuit physical layout to obtain the most vulnerable circuit sites, allowing for selective slight modifications of the logic cells' placement to combat high soft error rates. Moreover, gate sensitivity is identified through a metric called Glitch Latching Probability (GLP), which represents the impact of the masking mechanisms on SET pulse propagation. Finally, we report a comparison of our work with similar works with respect to the factors that each of them takes into consideration for SER evaluation. The results are verified against HSPICE, demonstrating close agreement.

While our main work focuses on the analysis of soft errors in conventional Bulk CMOS technologies, the emergence of cutting-edge technologies, such as SOI, FDSOI and FinFET, has raised interest in examining the impact of ionizing radiation on such technologies. Thus, we conclude this work by presenting various results of mixed-mode and 3D TCAD simulations, which characterize the behavior of FDSOI devices when a heavy-ion strike occurs. The results are compared with those of the Bulk technology, indicating which of the two is more susceptible to such hazards.

#### 1.2.2 Publications

Throughout the doctoral studies the author has submitted, published and presented his work in various international conferences and proceedings as listed below.

- G. I. Paliaroutis, P. Tsoumanis, G. Dimitriou and G. I. Stamoulis, "SER Analysis for Multiple Affected Gates". International Conference on Computer Science, Computer Engineering, and Social Media (CSCESM), Special Session on High-Level Synthesis, CAD and Applications, December 12–14, 2014, Thessaloniki, Greece.
- G. I. Paliaroutis, P. Tsoumanis, G. Dimitriou and G. I. Stamoulis, "SER Analysis of Multiple Transient Faults in Combinational Logic". ACM SouthEast European Design Automation, Computer Engineering, Computer Networks and Social Media Conference (SEEDA-CECNSM), September 25–27, 2016, Kastoria, Greece.
- G. I. Paliaroutis, P. Tsoumanis, G. Dimitriou and G. I. Stamoulis, "Placement-aware Simulation of Multiple Transient Faults in Combinational Logic". Accepted for publication at North Atlantic Test Workshop (NATW) 2017.
- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou and G. I. Stamoulis, "Placement-based SER estimation in the presence of multiple faults in combinational logic". 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS), September 25–27, 2017, Thessaloniki, Greece.
- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou and G. I. Stamoulis, "A Placement-Aware Soft Error Rate Estimation of Combinational Circuits for Multiple Transient Faults in CMOS Technology". IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), October 8–10, 2018, Chicago, IL, USA.
- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou and G. I. Stamoulis, "Multiple Transient Faults in Combinational Logic with Placement Considerations". 8th International Conference on Modern Circuits and Systems Technologies (MOCAST), May 13–15, 2019, Thessaloniki, Greece. (Best Paper Award nominee)
- C. Georgakidis, G. I. Paliaroutis, N. Sketopoulos, P. Tsoumanis, C. Sotiriou, N. Evmorfopoulos and G. Stamoulis, "A Layout-Based Soft Error Rate Estimation and Mitigation in the Presence of Multiple Transient Faults in Combinational Logic". 21st International Symposium on Quality Electronic Design (ISQED), March 25–26, 2020, Santa Clara, CA, USA. (Virtual)

- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, and G. Stamoulis, "On the Impact of Electrical Masking and Timing Analysis on Soft Error Rate Estimation in Deep Submicron Technologies". Accepted in Work-in-Progress Poster Session of Design Automation Conference (DAC), December 5–9, 2021, San Francisco, CA, USA.

His publication history also includes a journal article.

- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou, and G. Stamoulis, "SET Pulse Characterization and SER Estimation in Combinational Logic with Placement and Multiple Transient Faults Considerations". Technologies 2020, 8, 5.

## 1.3 Outline

This PhD dissertation is organized as follows. Chapter 1 introduces the motivation for this work and highlights the author's main contributions. Chapter 2 presents the background regarding the radiation-induced soft errors. Chapter 3 presents the relevant work that exists in the literature so far. Chapter 4 introduces the SER estimation methodology, elaborating on the algorithms and models that are implemented and incorporated into the proposed tool. Chapter 5 presents various simulation results obtained on the ISCAS '89 benchmark suite as well as the verification framework with SPICE. A SET pulse characterization of FDSOI technology with a TCAD simulation software is presented in Chapter 6, and finally, Chapter 7 concludes this dissertation with some useful remarks.

# Chapter 2

## Background

In the past few decades, the radiation-induced soft errors have become one of the major issues regarding IC reliability. Nowadays, this concern still remains in the spotlight, as the semiconductor industry's progress into the deep sub-micron technologies has rendered the chips more vulnerable to such hazards. This chapter presents an overview of the radiation-induced soft errors. First, the prevalent causes of soft errors in submicron devices are highlighted. Then, the radiation-induced SET generation mechanism in silicon is presented, indicating some critical aspects of the ionization and of the modeling of such incidents at the logic level. Finally, some of the well-known mitigation techniques to cope with the soft errors are outlined.

## 2.1 Main Causes of Soft Errors

The soft errors, which are also known as Single Event Upsets (SEUs), constitute a subcategory of a wide classification called Single Event Effects (SEEs), that is, upsets caused by an event, such as a single energetic particle [6]. The emergence of SEUs in space applications was initially observed in the mid-1970s, when the authors in [7] reported anomalies in communication satellite operation, which resulted from the interaction of galactic cosmic rays with the digital circuits, triggering the Flip-Flops (FFs). However, the first tangible studies regarding soft errors occurring in terrestrial electronics emerged a few years later. Such errors, which constitute a temporary, non-destructive effect that differs from the permanent nature of hard errors, are mainly caused by alpha particles, energetic neutrons produced from cosmic radiation, and thermal neutrons.

#### 2.1.1 Alpha particles

The authors in [8] introduced the alpha-particle-induced soft errors in dynamic memories. In particular, they showed that the alpha particles emitted by the radioactive decay (or alpha decay) of low-concentration impurities in the packaging materials of ICs, such as uranium and thorium, are able to penetrate the silicon, creating enough electron-hole pairs to eventually cause single-bit flips. An alpha particle consists of two protons and two neutrons bound together, is identical to a helium-4 nucleus, and is symbolized as $\alpha$, $^4_2\alpha$ or $^4_2$He. The energy of the emitted alpha particles varies from 3 MeV to 7 MeV (average kinetic energy of 5 MeV), which is sufficient to provoke a disturbance, given that the minimum energy required to generate one electron-hole pair in silicon is 3.6 eV. Thus, a single alpha particle is capable of generating about a million electron-hole pairs across a path of just a few microns (2 to 3 microns) in length, resulting in the corruption of stored information, that is, a soft error. Moreover, it should be noted that the emission rate of such particles depends on the purity of the materials used in the fabrication process. In [9], the collected and critical charges due to alpha particle strikes were modeled, and the change of the charge limits as technology downscales indicated that the soft failure probability increases.
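The order of magnitude quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that the particle loses its entire kinetic energy through ionization in silicon (an upper bound); the function names are illustrative and not part of the dissertation's tool.

```python
# Back-of-the-envelope estimate of the charge generated by one alpha particle.
# Assumption: the whole kinetic energy is spent creating electron-hole pairs.
E_PAIR_EV = 3.6            # energy per electron-hole pair in silicon (value from the text)
Q_ELECTRON_C = 1.602e-19   # elementary charge in coulombs


def electron_hole_pairs(particle_energy_mev: float) -> float:
    """Number of electron-hole pairs generated by a particle of the given energy."""
    return particle_energy_mev * 1e6 / E_PAIR_EV


def deposited_charge_fc(particle_energy_mev: float) -> float:
    """Corresponding charge of one carrier type, in femtocoulombs."""
    return electron_hole_pairs(particle_energy_mev) * Q_ELECTRON_C * 1e15


for energy_mev in (3.0, 5.0, 7.0):
    pairs = electron_hole_pairs(energy_mev)
    charge = deposited_charge_fc(energy_mev)
    print(f"{energy_mev:.1f} MeV: {pairs:.3g} pairs, {charge:.0f} fC")
```

For the average energy of 5 MeV this gives roughly 1.4 million pairs, consistent with the "about a million" figure, although only a fraction of this charge is normally collected by a single junction.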

#### 2.1.2 Cosmic radiation

A prominent work presented in [10, 11], and followed by [12] some years later, revealed that cosmic radiation constitutes a significant threat for terrestrial electronic devices. In particular, the high-energy cosmic rays that bombard the earth interact with the particles of the earth's atmosphere, and a number of secondary particle cascades are produced, which, in turn, create further cascades, as shown in Figure 2-1.



Figure 2-1: Representation of various particle cascades as high-energy cosmic rays interact with earth's atmosphere particles [13].

In fact, due to the intensity of the collision between the cosmic rays and the atmospheric atoms, less than 1% of the primary particles eventually reach sea level. Therefore, the terrestrial cosmic rays, that is, the rays that are capable of reaching the earth, are primarily the cascade particles of the third generation and above. The cascade particles consist of electrons, protons, photons, pions, muons and energetic neutrons. The latter, which prevail at sea level (about 95% of the terrestrial particles, with the rest being protons and pions), may cause significant soft fails in electronics. Even though they are not charged, their interaction with the silicon nuclei in a chip may induce the fission of those nuclei, producing several secondary fast heavy particles, which are able to generate, in the vicinity, more than four times as much electrical charge as alpha particles. Such charge bursts are sufficient to upset a semiconductor device, eventually resulting in soft errors. At high terrestrial altitudes, the percentage of protons and pions increases significantly and, as the cosmic ray intensity is greater, soft fails are more probable than at sea level. In addition, it has been reported that the extensive shielding of electronic devices with concrete is able to eliminate such soft fails, as the cosmic ray intensity approaches zero in such a case. The outcomes of this extensive research allow for SER evaluation in any location on earth, since the particle flux is known with high accuracy both at high altitudes and at sea level.

#### 2.1.3 Other sources

Apart from the alpha particles and the energetic neutrons from cosmic rays, there are some further sources of soft errors. The cosmic ray thermal neutrons, that is, the free neutrons (those that are not bound in an atomic nucleus) having a low kinetic energy after reaching thermal equilibrium with the surrounding materials, may potentially pose a significant threat to the proper operation of semiconductor devices. Thermal neutrons feature a higher cross-section in the fission of certain materials, compared to high-energy neutrons, thus resulting in the emission of reaction products. The study presented in [14] showed that the interaction of thermal neutrons with the boron isotope <sup>10</sup>B is a significant source of soft errors. The <sup>10</sup>B nucleus becomes unstable when it captures a thermal neutron, and the resultant fission releases an excited <sup>7</sup>Li nucleus, an alpha particle and a gamma photon, which may subsequently generate charge in the silicon and provoke a soft error. In semiconductor device fabrication, boron serves as a p-type dopant and is used in borophosphosilicate glass (BPSG) as well. BPSG layers are widely utilized as top passivation layers of semiconductor devices and as intermetal dielectric insulating layers. The <sup>10</sup>B fission in the BPSG is reported to be a significant source of soft errors in BPSG-based electronic devices [15, 16], as the <sup>10</sup>B concentration is higher compared to the other implant layers. Thus, the utilization of the alternative <sup>11</sup>B isotope, which has a lower capture cross-section and therefore contributes less to soft errors, for the BPSG of critical designs can be a drastic radiation-hardening technique to mitigate SER. In advanced CMOS technologies, however, the copper interconnects that have superseded the aluminum interconnects are subject to a different fabrication process, rendering the use of BPSG obsolete, and thus such sources of soft errors are regarded as insignificant.

Finally, a minor source of soft errors originates from signal integrity issues, such as crosstalk between neighboring nets due to the cross coupling capacitance. Nevertheless, the contribution of such sources to the overall SER is rather insignificant when compared to radiation hazards.

## 2.2 Soft Errors on Silicon

#### 2.2.1 Radiation interaction with silicon

To proceed further in the analysis of the radiation-induced soft errors in the combinational logic of ICs, it is important to comprehend the basic mechanism of ionizing particles striking the silicon, regarding their interaction with the material and the immediate consequences on the proper device operation. The fundamental structural components of modern electronics are the transistors and, particularly, the MOSFETs which have been the dominant semiconductor devices since the 1960s. Nowadays, an IC chip, such as a microprocessor, consists of some billions of transistors which is a result of the device scaling and miniaturization, and the rapid growth of semiconductor technology in general. Thus, a potentially harmful ionizing particle may strike one or more of these transistors and affect the proper operation of the chip.

Suppose that the MOSFET device (in particular, an NMOS transistor) of Figure 2-2 is struck by a high-energy particle at the p-n junction area [17]. As the particle traverses the silicon, a cylindrical track of electron-hole pairs is created and a series of physical processes take place, contributing to the overall charge collection process. Upon the particle strike, a drift current, i.e., a movement of the deposited charge carriers towards the p-n junction, instantaneously emerges due to the electric field of the depletion region. At the same time, due to the IR drop in the substrate, which results from the high current density, the potential of the depletion region extends into the substrate forming a funnel, thus enhancing the collection of the generated electrons in the well. Therefore, a substantial current emerges on the particular device node. Afterwards, a slower diffusion collection process supersedes the fast and predominant drift collection. Due to the spatially varying concentration of electrons and holes in the silicon, the excess electrons diffuse and are collected by the depletion region. The overall current collected from the transistor's internal contact, throughout the whole process, is presented in Figure 2-2(d).

Figure 2-2: Charge generation and collection due to a particle strike on a reverse-biased p-n junction. (a) Generation of electron-hole pairs along the ion traversal, (b) quick drift charge collection and extension of the depletion region towards the substrate forming a funnel, (c) slow diffusion charge collection, and (d) the generated current pulse on the transistor contact throughout this process [17].

Additionally, there is another mechanism that contributes substantially to the overall charge collection. In the context of the previous illustration of the charge collection process, while the electrons are collected rapidly by the drain (drift phase), excess holes in the well raise the well potential and lower the source/well potential barrier, resulting in the injection of electrons from the source into the p-well. These electrons may then be collected by the drain, causing an increase in the total collected charge. This mechanism is the well-known parasitic bipolar effect (or amplification): the injection of electrons over the source/well barrier makes the device act like a bipolar transistor, where the source is the emitter, the channel is the base, and the drain is the collector [18]. The parasitic bipolar effect has been found to contribute about 30% to the total collected charge in both 130-nm and 90-nm technologies, and its contribution increases with rising temperature [19]. This indicates the significance of the parasitic bipolar effect to the radiation-induced current pulse, all the more so with the continuous technology downscaling.

#### 2.2.2 Linear Energy Transfer

Another problem of technology downscaling is that radiation particles of lower energy, which usually have low penetration depth, may potentially cause transient faults by ionizing the atoms or molecules of the matter that they strike. The Linear Energy Transfer (LET) is an energy loss metric and, in particular, the amount of energy per unit distance that a particle deposits, through ionization, to the material (i.e., the semiconductor in this case) along its track. In other words, it is the retarding force that the ionizing particle experiences as it penetrates the matter, and equals:

$$LET = \frac{dE}{dl} \tag{2.1}$$

where $dE$ is the average energy locally imparted to the medium by a charged particle of specified energy when traversing a distance $dl$. LET is a positive quantity and is generally expressed in MeV/cm or in $MeV \cdot cm^2/mg$, the latter resulting from the division by the material density in $mg/cm^3$. It depends mostly on the type of radiation particle, the traversed material and the angle of the strike. Also note that it is convenient to convert LET into charge per unit length (e.g., pC/µm) so that it can be compared with the device physical dimensions and the charge stored at the critical node.
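The unit conversion mentioned above can be illustrated with a short worked example. This is a sketch under stated assumptions: the target material is silicon with a density of about 2.33 g/cm³ and a pair-creation energy of 3.6 eV, and the result is expressed in pC/µm, a unit commonly used for this purpose. The function name is hypothetical.

```python
# Convert a mass-normalized LET (MeV*cm^2/mg) into deposited charge per unit
# track length (pC/um). Assumed target material: silicon.
SI_DENSITY_MG_PER_CM3 = 2330.0   # assumed silicon density (about 2.33 g/cm^3)
E_PAIR_EV = 3.6                  # energy per electron-hole pair in silicon
Q_ELECTRON_C = 1.602e-19         # elementary charge in coulombs


def let_to_pc_per_um(let_mev_cm2_per_mg: float) -> float:
    """Charge deposited per micron of track for a given LET in silicon."""
    mev_per_cm = let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3   # undo the density division
    ev_per_um = mev_per_cm * 1e6 / 1e4                        # MeV -> eV, cm -> um
    pairs_per_um = ev_per_um / E_PAIR_EV                      # electron-hole pairs per um
    return pairs_per_um * Q_ELECTRON_C * 1e12                 # C -> pC


print(f"1 MeV*cm^2/mg  = {let_to_pc_per_um(1.0):.4f} pC/um")
print(f"1 pC/um needs ~ {1.0 / let_to_pc_per_um(1.0):.1f} MeV*cm^2/mg")
```

Under these assumptions, an LET of 1 MeV·cm²/mg corresponds to roughly 0.01 pC/µm, so about 96 to 97 MeV·cm²/mg are needed to deposit 1 pC per micron of track in silicon.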

The ion energy, the energy loss (LET), and the depth (range) that the ion traverses are closely related. In particular, LET increases with increasing energy until reaching a maximum value, called the Bragg peak, and then decreases with increasing energy. A Bragg peak is also observed in the plot of the energy loss of ionizing radiation during its travel through matter. In particular, the energy loss increases with the distance that the particle travels and reaches a maximum immediately before the end of the particle's path, as shown in Figure 2-3. The ion range is an important parameter, as it should be sufficient to provoke charge collection in the silicon, and it depends on the energy of the particle. As the energy increases, which means lower LET, the Bragg peak moves deeper into the material, broadens, and decreases in height. Therefore, this analysis demonstrates the interdependence of the individual factors and, mainly, the significance of LET in radiation effects.

Figure 2-3: Energy loss for alpha particles of 5.49 MeV moving through air.

#### 2.2.3 Modeling of the ionizing particle strike

The overall collected charge, as a result of an ionizing particle striking the silicon p-n junction of a submicron device, corresponds to a current that emerges on the respective device contact. The charge collection current is sufficiently modeled with a widely-known approximation model, which utilizes a double exponential waveform and can be expressed as follows [20, 21]:

$$I_{particle}(t) = \frac{Q_{coll}}{\tau_{\alpha} - \tau_{\beta}} \left(e^{-t/\tau_{\alpha}} - e^{-t/\tau_{\beta}}\right) \tag{2.2}$$

where $Q_{coll}$ denotes the total collected charge from the p-n junction, $\tau_{\alpha}$ is the time constant for the electron-hole pairs deposition in the p-n junction, and $\tau_{\beta}$ is the time constant of the ion-track establishment, being just a few picoseconds and much less than $\tau_{\alpha}$. Generally, $Q_{coll}$ depends on the type and energy of the particle that strikes the silicon, the angle of the strike, the proximity of the strike to the p-n junction, the temperature, the bias conditions, and the semiconductor device characteristics, such as the technology node, the supply voltage, the doping concentration, etc. The total $Q_{coll}$ is equal to the area under the current curve, as shown in Figure 2-4. Therefore, given the generated current pulse, $Q_{coll}$ can be obtained from the integral of $I_{particle}$ with respect to time, as follows:

Figure 2-4: Generated double exponential current pulse.

$$Q_{coll} = \int_0^{\infty} I_{particle}(t)\,dt \tag{2.3}$$
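The relation between Equations 2.2 and 2.3 can be verified numerically. The sketch below evaluates the double exponential pulse and integrates it over a window long enough for the pulse to decay; the values of `Q_COLL`, `TAU_A` and `TAU_B` are assumptions chosen only for illustration, not values characterized in this work.

```python
import math


def particle_current(t, q_coll, tau_a, tau_b):
    """Double exponential current of Equation 2.2 (amperes, t in seconds)."""
    return q_coll / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def collected_charge(q_coll, tau_a, tau_b, t_end, steps=20000):
    """Trapezoidal integration of the current pulse from 0 to t_end (Equation 2.3)."""
    dt = t_end / steps
    total = 0.0
    previous = particle_current(0.0, q_coll, tau_a, tau_b)
    for k in range(1, steps + 1):
        current = particle_current(k * dt, q_coll, tau_a, tau_b)
        total += 0.5 * (previous + current) * dt
        previous = current
    return total


# Illustrative (assumed) parameters.
Q_COLL = 50e-15   # 50 fC of collected charge
TAU_A = 200e-12   # slower time constant (tau_alpha)
TAU_B = 10e-12    # faster time constant (tau_beta)

# Time of the current peak, obtained by setting dI/dt = 0.
t_peak = math.log(TAU_A / TAU_B) * TAU_A * TAU_B / (TAU_A - TAU_B)
i_peak = particle_current(t_peak, Q_COLL, TAU_A, TAU_B)
q_numeric = collected_charge(Q_COLL, TAU_A, TAU_B, t_end=20 * TAU_A)

print(f"peak current {i_peak * 1e6:.1f} uA at {t_peak * 1e12:.1f} ps")
print(f"integrated charge {q_numeric * 1e15:.2f} fC")
```

The integrated charge returns the value of $Q_{coll}$ used to build the pulse, which confirms that the prefactor $Q_{coll}/(\tau_{\alpha} - \tau_{\beta})$ normalizes the area under the curve.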

Moving from the device level to a higher abstraction level, i.e., the transistor level, the incident of an ionizing particle strike can be simply modeled by connecting the double exponential independent current source, shown in Equation 2.2, to a particular transistor contact. The contact to which the current source is attached and the direction of the current depend on the transistor that the particle strikes. Note also that an ionizing particle may potentially affect a transistor solely when the transistor is inactive (or in the cut-off state), which depends on the logic input state. Figure 2-5 presents the schematics of an inverter while a radiation particle strikes (a) the inactive nMOS and (b) the inactive pMOS transistor. To be more specific, an ionizing particle striking the inactive nMOS transistor is modeled with a current source attached to its drain, with the direction being from the drain contact to the body contact (Figure 2-5(a)). Similarly, a particle striking the inactive pMOS transistor is modeled with the current source attached to its drain, but with the direction being from the body contact to the drain contact (Figure 2-5(b)).



Figure 2-5: Modeling of the radiation particle strike with a current source when the strike occurs on (a) nMOS transistor and (b) pMOS transistor of an Inverter.

The double exponential current source model is used for the electrical simulation, with SPICE, of transient faults due to radiation particle strikes on sensitive circuit nodes. It allows for the evaluation of the circuit susceptibility to such hazards and the characterization of the transient faults, accounting for various factors, such as supply voltage, temperature, load capacitance, etc. In combinational logic, the injected current pulse may appear at the output of the logic gate as a voltage pulse or glitch (i.e., a prompt change of the logic state). The generated pulse may exceed the threshold level, that is, half of the supply voltage, and temporarily settle to logic 1 or logic 0. Figure 2-6 shows an ionizing particle that strikes the inactive pMOS transistor of a CMOS Inverter. A glitch will emerge at the output of the gate as a result of the capacitor charging that follows the charge generation. Upon the ending of charge collection, the signal recovers to logic 0 through the discharging of the capacitor, indicating the end of the phenomenon.
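The charging and discharging behavior described above can be sketched with a crude lumped model, which is in no way a substitute for SPICE. The assumptions are that the output node is a single capacitance `c_node`, that the conducting nMOS is a fixed resistance `r_on` to ground, and that the node voltage is clamped at the supply voltage; all parameter values and the function name are hypothetical.

```python
import math


def simulate_glitch(q_coll, c_node=2e-15, r_on=5e3, vdd=1.0,
                    tau_a=200e-12, tau_b=10e-12, t_end=2e-9, dt=1e-13):
    """Forward-Euler solution of C*dV/dt = I(t) - V/R for an output held at logic 0.

    Returns the peak node voltage and the time spent above vdd/2 (glitch width).
    """
    v = 0.0
    peak = 0.0
    width = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        # Injected double exponential current (Equation 2.2).
        i_inj = q_coll / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))
        # The capacitor charges through the injected current and leaks through r_on.
        v += dt * (i_inj - v / r_on) / c_node
        v = min(v, vdd)   # crude clamp: the node cannot rise far above the supply
        peak = max(peak, v)
        if v > vdd / 2:
            width += dt
    return peak, width


for q in (1e-15, 50e-15):
    peak, width = simulate_glitch(q)
    print(f"Q_coll = {q * 1e15:.0f} fC: peak = {peak:.3f} V, "
          f"time above Vdd/2 = {width * 1e12:.0f} ps")
```

With these assumed values, the small charge produces only a minor disturbance that stays below the threshold, whereas the large charge drives the node above half of the supply voltage for a sustained interval before it recovers to logic 0.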

For a glitch to change the output logic state, however, one condition must be met: the resultant collected charge must exceed the minimum amount of charge required for a particle to induce the change of the output logic state. This charge is called critical charge ($Q_{crit}$) and is mostly related to the device characteristics (e.g., manufacturing technology, doping level, device size, transistor capacitance, etc.). It can be estimated as follows:

Figure 2-6: Particle striking the off pMOS of a CMOS Inverter.

$$Q_{crit} = C_{node} \cdot V_{dd}/2 \tag{2.4}$$

where $C_{node}$ is the node capacitance and $V_{dd}$ the supply voltage. A more accurate approximation of $Q_{crit}$ can be obtained from electrical simulations with SPICE or device-level simulations with TCAD tools.
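Equation 2.4 lends itself to a quick numerical illustration. The following sketch uses assumed, round-number capacitances and supply voltages (they do not correspond to any characterized technology) and hypothetical function names, and simply compares a collected charge against the first-order critical charge.

```python
def critical_charge(c_node_farads: float, vdd_volts: float) -> float:
    """First-order critical charge of Equation 2.4, in coulombs."""
    return c_node_farads * vdd_volts / 2.0


def may_flip_output(q_coll: float, c_node_farads: float, vdd_volts: float) -> bool:
    """True when the collected charge exceeds the first-order critical charge."""
    return q_coll > critical_charge(c_node_farads, vdd_volts)


# Assumed values: an older node (2 fF, 1.0 V) versus a scaled node (1 fF, 0.8 V).
for label, c_node, vdd in (("older node", 2e-15, 1.0), ("scaled node", 1e-15, 0.8)):
    q_crit = critical_charge(c_node, vdd)
    print(f"{label}: Q_crit = {q_crit * 1e15:.2f} fC, "
          f"0.5 fC strike flips output: {may_flip_output(0.5e-15, c_node, vdd)}")
```

In this example the same 0.5 fC strike is harmless for the larger node but exceeds the critical charge of the scaled one, which illustrates why lower capacitance and supply voltage increase the susceptibility.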

The value of $Q_{crit}$ is decisive for the emergence of transient faults and, potentially, soft errors. Thus, it may be associated with the technology nodes, acting as an indicator of the device susceptibility to radiation hazards. However, the continuous shrinking of transistor geometries along with the supply voltage scaling results in decreasing values of $Q_{crit}$, which increases the soft error susceptibility of modern ICs, raising concerns regarding the reliability of future submicron technologies [22].

Generally, a radiation particle striking a sensitive silicon node induces different results depending on the circuit in which this event occurs. In memories, a soft error emerges when the read operation occurs after the stored value is modified. In SRAM memories, the particle strike may result in flipping the logic state of the cell, whereas in DRAM memories, the generated charge may affect the read or write operations due to the upset of the cell's logic and control circuits. On the other hand, in combinational logic, the generated pulse at the gate output, called SET, may propagate along the forward cone paths of the circuit and be eventually captured by the storage elements, such as latches and FFs, inducing a soft error. However, the soft error inducing mechanism in combinational logic is much more complicated, as there are individual mechanisms, i.e., masking effects, that occur and determine the emergence of soft errors.

## 2.3 Soft Error Rate Mitigation

Since the reliability of modern ICs is non-negotiable, constituting a major concern in the semiconductor industry, the development of methods and techniques to deal with the radiation-induced soft errors is indispensable. Over the recent decades, many approaches have emerged attempting to mitigate the SER of ICs. In general, the techniques that exist in the literature are divided into hardware-level (system-level, device-level and circuit-level) and software-level techniques [23]. Since the SEEs constitute a hazard that primarily affects the proper operation of the circuit, the hardware-level mitigation techniques are more effective and prevail in this field of research. However, in some cases, alternative software-level mitigation schemes are selected to cope with the excessive hardware cost and power dissipation that the former techniques entail.

At system-level, the radiation-hardening techniques that dominate include the addition of redundancy either in logic or memory circuits. In logic circuits, the Triple Modular Redundancy (TMR) can be utilized by replicating the hardware three times and filtering out any erroneous value with a majority voting logic. Usually, the area overhead is excessive, but this option is sometimes preferred to fully protect critical systems. As regards the memory circuits, a traditional technique to protect them against SEUs is to append a parity bit to the memory word, which is able to detect such upsets. To also correct the detected errors, additional circuitry implementing an Error Correction Code (ECC) is needed. However, it may result in significant undesirable area cost and power penalties, especially when it comes to the correction of multiple-bit errors, and is thus difficult to apply.
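The two redundancy schemes described above can be illustrated behaviorally. The following sketch models a bitwise majority voter over three replica outputs and a single even-parity bit over a memory word; it is a software illustration with hypothetical function names, not a hardware description.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three replica outputs (TMR voter)."""
    return (a & b) | (a & c) | (b & c)


def even_parity_bit(word: int) -> int:
    """Parity bit that makes the total number of 1s (word plus parity) even."""
    return bin(word).count("1") % 2


def parity_check_passes(word: int, stored_parity: int) -> bool:
    """True when the word still agrees with the parity bit stored alongside it."""
    return even_parity_bit(word) == stored_parity


# TMR: one replica is corrupted in its most significant bit, and the voter masks it.
golden = 0b1011
print(bin(majority_vote(golden, golden, golden ^ 0b1000)))

# Parity: a single-bit upset is detected, a double-bit upset goes unnoticed.
parity = even_parity_bit(golden)
print(parity_check_passes(golden ^ 0b0010, parity))   # single upset
print(parity_check_passes(golden ^ 0b0110, parity))   # double upset
```

The last two lines show the limitation noted in the text: a single parity bit detects an odd number of flipped bits but cannot correct them, and an even number of upsets escapes detection, which is why ECC requires additional check bits and circuitry.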

The purpose of the device-level mitigation techniques is to reduce the amount of collected charge due to a high-energy particle strike on the silicon by means of modifying the fabrication process. The most well-known approach is the Silicon-On-Insulator (SOI) technology, which is a variation of the conventional Bulk technology. In particular, the fabrication process differs by including an additional ultrathin insulating layer of silicon dioxide (SiO<sub>2</sub>), called Buried Oxide (BOX), within the substrate, providing a different type of wafer. In this way, the silicon gap between the source and drain junctions decreases, preventing the development of high collected charge values, and thus producing smaller SETs [24]. Nevertheless, this kind of mitigation technique is expensive in terms of time and cost, as it requires additional steps in the fabrication process.

Consequently, intervening in the hardware of the design constitutes the basic means of SER mitigation. At circuit-level, there are approaches that include the utilization of additional hardware to reduce the consequences of SEEs and harden the circuit. Several techniques that add spatial redundancy exist, such as the TMR method. The TMR module consists of three identical instances of the logic gate that needs to be protected, utilizing a voting circuit (i.e., voter) that filters out the SET and propagates the correct logic value. Others attempt to modify the delay of the SET through additional hardware (e.g., buffers and voting circuits) so that it cannot be latched by memory elements. There are also hybrid techniques that combine spatial and temporal redundancy methods to moderate the area overhead and operating frequency degradation. Finally, other radiation-hardening approaches increase the node capacitance or the driver's transistor sizes to avoid the excessive area and delay overheads.

In light of the overhead that such methods entail, it is important that radiation-hardening is applied on just a small proportion of a circuit to avoid performance degradation. Therefore, to selectively harden a circuit, a reliable and accurate evaluation of the vulnerability of ICs to radiation-induced hazards can be the determinant factor in the development of reliable electronic systems, through the employment of radiation-hardening techniques.

# Chapter 3

## Related Work

Over the past years, the ever-growing interest in the study of radiation-induced soft errors and their mechanisms has led to significant contributions and progress in this field. Since the relevant literature is vast, in the following paragraphs we refer to the most representative studies that constitute a breakthrough and to the works that have assisted in the further advancement of the academic research.

The first primitive approaches to estimate the SER in CMOS logic circuits emerged in the early 1990s [25, 26, 27]. In [25] the authors propose an algorithm to evaluate the propagation probabilities of single-event-induced errors modeled with closed-form equations. A method for predicting the SER due to  $\alpha$ -particle incidents of CMOS circuits with dynamic registers is presented in [26] and applied to a pipelined multiplier using data of future submicron technologies. In [27] a more advanced computer program, called SEMM, calculates the failure probabilities due to  $\alpha$ -particles and cosmic ionizing radiation, modeling the transient charge collection at the semiconductor junctions and contacts. A switch-level fault simulation algorithm of transients in CMOS is presented in [28]. The algorithm predicts the transient pulse widths, using a first order RC model, and models their propagation to subsequent logic blocks, setting appropriate propagation rules and eventually reducing fault simulation time.

Meanwhile, the gate-level simulation methodologies began gradually succeeding the old-fashioned and time-expensive electrical-level simulations. An intermediate approach of a transient fault simulator, using a PieceWise Quadratic approximation of the injected current pulse for the analytical solution of the MOS differential equations, is introduced in [29]. The authors in [30] present a sophisticated gate-level transient fault injection simulation environment, which uses logic-level and latching operation models to propagate the fault effects to the latch outputs. The majority of the works existing in the literature are based on the modeling and implementation of the three masking mechanisms that inherently mitigate the negative impact of radiation-induced faults on circuit reliability, i.e., logical, electrical, and timing masking [31, 32, 33, 34, 35, 36, 37, 38]. Other works are based on probabilistic models to evaluate the SER [39, 40]. In [41] the authors perform a block-based SER analysis, based on an analytical expression for the electrical masking modeling. The authors in [42] evaluate the soft error circuit reliability through the development of a general computational framework, based on probabilistic transfer matrices (PTMs). In particular, matrix calculations are applied and Algebraic Decision Diagrams (ADDs) are utilized to form an overall circuit PTM, allowing for accurate reliability evaluation of large circuits. In [43] a balanced combination of probability and graph theory, circuit simulation, and fault simulation is employed, whereas a statistical method for SER estimation through closed-form models is presented in [44]. The statistical SER framework in [45] takes into consideration the full spectrum of charge collection and achieves a fast result, exploiting SVM models for cell characterization, instead of LUT methods. The approach in [46] presents a statistical SER analysis, based on two frameworks (table-lookup and SVR-learning), to cope with the underestimation of SER due to the pulse width fluctuation after propagation. The authors in [47] propose a fast signature-based SER framework that efficiently evaluates logical masking. The contribution of the particle striking time and multi-cycle effects to an accurate and efficient SER analysis, avoiding its underestimation, is discussed in [48].

However, the continuous process technology shrinking, resulting in the reduction of the distance among the logic cells, has made multiple transient faults, caused by single particle strikes, more prevalent, thus increasing circuit vulnerability [49, 50, 51]. Particularly, in [51] the authors perform a detailed characterization to quantify the likelihood of a SET to cause multiple bit-flips in logic circuits, proving that the probability is quite significant, especially as technology scales, and therefore, such errors should be taken into account in order to obtain a realistic fault model for soft errors. Though none of the aforementioned works take this trend into consideration, a number of research studies on the evaluation of SER, considering Single Event Multiple Transients (SEMT), have emerged recently. Generally, there are two types of approaches that predominate in the literature: (i) the non-layout-aware [52, 53, 54] and (ii) the layout-aware [55, 56, 57, 58, 59, 60, 61]. The former works consider that SEMTs occur at the output of the physically adjacent gates, which are identified at the gate-level by examining fan-outs and fan-ins. Nevertheless, using only the logic-level netlist for the determination of circuit error sites, and thus neglecting the layout-level adjacency of the cells, may result in inaccurate estimation. Also, the authors in [54] assume that no more than three adjacent gates can be affected by a single event, which may eventually underestimate the SER. On the other hand, the latter provide a more realistic and reliable SER estimation analysis by taking into consideration the circuit layout. An intermediate approach that takes into account single and double transient faults on random nodes is presented in [62], focusing on the probabilistic logic simulation, albeit neglecting the electrical and timing masking effects, resulting in inaccuracies of the failure probability. The layout-aware approaches in [56, 57, 63] introduce the concept of sensitive regions of a cell that determine the fault generation. A recent unconventional approach, presented in [64], proposes a compact and fast physical model for SER estimation in the space environment, which relies solely on experimental cross-sectional data and the LET spectrum of the ionizing particle.

Apart from the methodologies that focus on the evaluation of SER, various approaches that combine SER estimation and mitigation also exist in the literature. In [65] a combination of a sensitivity-based gate sizing algorithm and a slack-based FF selection is used to achieve SER reduction with reasonable costs. A symbolic framework for circuit reliability analysis that is used for selective gate resizing to achieve circuit hardening is presented in [66]. In [67] the authors propose a framework for adding and removing redundant wires to eliminate the gates that contribute most to the overall SER, which is enhanced with a gate resizing technique to further optimize SER robustness. In [68] a sensitivity-based gate sizing methodology is presented, whereas a simulation-based approach in [69] aims at maximizing the logical masking potential by extracting sub-circuits and re-synthesizing them. A placement-based radiation hardening technique of removing whitespace between adjacent cells, which have common gates in their forward logic cones, and thus increasing the logical masking probability, is introduced in [70]. In [71] a Monte Carlo-based SER estimation method is presented along with two layout-aware approaches to mitigate SER: the first applies spacing among all the cells, whereas the second converts the most sensitive cells into a TMR structure, abiding by the minimum required distance among the TMR members to protect them against a potential SEMT occurrence.

The research in this field is not confined to just simulation methodologies to evaluate the radiation-induced SER. On the contrary, a significant part of the bibliography conducts measurements through neutron beam testing setups that generate particles from a wide energy spectrum either to acquire important data to be used further in simulations, such as SET pulse widths, or to evaluate directly the vulnerability of a circuit.

The authors in [72] utilize the Oracle's neutron beam device for accelerated testing, highlighting the impact of technology scaling on the soft error rates. More specifically, the SEU rate per SRAM cell increases with the technology downscaling, whereas the multi-cell upsets are more prevalent due to the shrinking of the feature sizes. Finally, the negative impact of microprocessor energy reduction techniques on the SEU rate is noted. In [73] the authors propose a SET pulse width measurement circuit, based on the propagation-induced pulse shrinking, whereas the simulation results of the neutron irradiation reveal the number of SETs, SEMTs, and MSETs (i.e., Multiple SETs) that occurred. In [74] heavy ion experiments are conducted to characterize the SEMTs and provide measured results with respect to both cross-sections and pulse widths.

Although real-time experiments constitute an important step to comprehend the behavior of modern chips in an environment of radiation fluxes, simulations are necessary to achieve scalability and obtain accurate results in a reasonable time. Besides, such simulation tools constitute the principal means of assessing an individual method. In [30, 75] the authors characterize the SET pulse generation and propagation under different design parameters through SPICE and TCAD simulations. In [76] the SET pulse width characterization results for 130nm and 90nm CMOS technologies through actual measurements are supported by 3D TCAD simulations, showing that pulse widths increase for the same radiation environment with technology scaling, whereas they are strongly related to the strike location. In [77] a built-in-self-test circuit implemented in a 45nm SOI technology is utilized for heavy ion experiments to measure the SET pulse widths, followed by mixed-mode 3D TCAD simulations to validate the results.

# Chapter 4

# Soft Error Rate Estimation Framework

A comprehensive framework for SER estimation of modern ICs encompasses a considerable number of complex procedures that must be combined functionally. This chapter presents these necessary components and their incorporation into an integrated tool. First, the fundamental elements of the proposed tool are presented, i.e., the natural masking mechanisms that are capable of mitigating soft errors. Second, the modeling of these mechanisms and their integration for SER calculation regarding SETs follows. Third, the steps for an efficient and accurate consideration of multiple transients caused by a single particle strike are discussed. Finally, the flow and the extensive algorithm of the implementation conclude this chapter.

## 4.1 Masking Mechanisms

To develop a reliable and accurate SER estimation framework, it is first and foremost important to model accurately the masking mechanisms that inherently determine the capability of a circuit to absorb and eliminate soft errors. These factors, which play a crucial role in the emergence of soft errors, are logical, electrical, and timing masking. The next subsections describe these masking mechanisms, whose modeling and implementation are integrated into the SER estimation tool.



Figure 4-1: A SET resulted from an ionizing particle strike on a NAND2 gate is subjected to the three masking mechanisms, i.e., (a) logical, (b) electrical, and (c) timing, as it propagates through the circuit.

### 4.1.1 Logical masking

The first factor that prevents a transient fault from propagating through the circuit is Logical Masking, which owes its name to the circuit logic, that is, the logic gates used for the implementation. In particular, as a transient pulse at the output of a gate propagates through the forward cone, it may arrive at a gate whose other input is at a controlling value. The controlling value is the value at a gate input that is sufficient to determine the gate output. As a result, the transient fault cannot propagate through this gate and is thus masked. The controlling value depends on the type of the logic gate. A NAND gate has a controlling value of 0, which means that whenever this value appears at one of its inputs, the output is at logic 1 regardless of the other inputs. For instance, in Figure 4-1(a), the transient fault at the output of the first NAND2 gate is logically masked at a subsequent NAND2 gate due to a controlling value of the other input, that is, logic 0.

### 4.1.2 Electrical masking

As mentioned, as soon as an ionizing particle strikes a gate, the glitch that emerges at the output propagates through the logic gates of the forward cone. However, the nonlinear electrical properties of the CMOS gates might impede its propagation. On the one hand, the pulse itself might not be sufficiently strong, in terms of width and amplitude, to charge or discharge the output capacitor and thus settle the output voltage level to logic 1 or logic 0, respectively. On the other hand, the shape, and thus the width, of a glitch is modified due to the difference in rise and fall transition delays. The attenuation of the generated glitch that generally occurs as it propagates through the subsequent gates is called Electrical Masking. Obviously, a slow gate has a greater contribution to electrical masking than a fast one. In Figure 4-1(b), a pulse generated at the output of a NAND2 gate is completely eliminated after a few logic stages.

### 4.1.3 Timing masking

Even if a transient fault is not eliminated by the aforementioned factors, there is still another mechanism that may mask it. Timing masking (also called temporal or timing-window masking) refers to the memory elements and, especially, the FFs, which may prevent a transient fault from becoming a soft error. A positive-edge-triggered FF captures the data input at a specific moment of the clock period, that is, at the positive edge of the clock. Yet, the input signal should be held steady before and after the clock edge so that its value is reliably sampled. Setup time is the time that the input data should remain unchanged before the active clock edge, and hold time is the time that the input data should be held steady after the active clock edge. The latching period corresponds to the sum of setup and hold time. Thus, a fluctuation of the signal during this period could cause the FF to latch a wrong value. Timing masking therefore occurs when the glitch arrives at the FF data input outside of the latching window (or latching period). The extent to which timing masking mitigates the SER depends on the temporal properties of the circuit, i.e., propagation delay, clock frequency, the timing parameters of the FFs, and the shape of the glitch itself as a result of its propagation up to the FF input. However, two factors that are, to a certain extent, independent of the circuit properties are the moment of the strike within the clock cycle and the area of the circuit where the strike occurs. Together, these factors significantly determine the arrival time at the FF, and thus the latching probability. On the other hand, a higher clock frequency, which means a smaller period, makes it more probable for transient faults to become soft errors, since the FF samples the data input more times per time unit. Figure 4-1(c) shows a transient pulse that arrives at the FF input outside the latching window and is eventually masked.

## 4.2 Single Event Transients

This section describes the individual parts that are combined for the estimation of SER in combinational logic, including the modeling of the masking mechanisms and the arrangement of pulse propagation issues, such as pulse reconvergence, and delay issues. These parts are implemented and incorporated into the proposed tool to evaluate the latching probabilities and, subsequently, the circuit SER taking into consideration particle strikes that induce solely SETs. A similar method to evaluate the individual gate sensitivity is also presented. Note that the soft errors induced by SEMTs involve a more complicated analysis, and thus a separate section in this chapter is dedicated to it.

### 4.2.1 Masking mechanisms modeling

First and foremost, the accurate modeling of the three masking mechanisms and their efficient implementation, in the context of an integrated SER estimation framework, are of great significance. In the next paragraphs, we describe in detail the way that these mechanisms are incorporated into the framework and we conclude this with an informative and compact pseudocode.

#### Logical masking

A plain implementation of logical masking requires a static logic simulation of the entire circuit. An initial simulation is performed for a random primary input (PI) vector in order to record the logic values at the output nodes and FF inputs. Subsequently, a transient fault, as a result of a particle strike, is injected on a gate, which is represented by flipping the gate's output logic value. Finally, another simulation is performed taking into account the faulty logic value, which propagates to the forward logic cone of the affected gate. To detect whether logical masking eventually occurred, the logic values at the outputs and FF inputs are observed and compared with the previous ones.
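The two-pass procedure described above can be illustrated with a minimal sketch. The netlist format, the gate set, and the function names (`simulate`, `is_logically_masked`) are hypothetical and chosen only for illustration; they are not the data structures of the actual tool.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical netlist format: gate name -> (gate type, input net names).
# Gates are assumed to be listed in topological order.
Netlist = Dict[str, Tuple[str, List[str]]]

GATE_FUNCS = {
    "NAND": lambda ins: int(not all(ins)),
    "NOR": lambda ins: int(not any(ins)),
    "AND": lambda ins: int(all(ins)),
    "OR": lambda ins: int(any(ins)),
    "NOT": lambda ins: int(not ins[0]),
}


def simulate(netlist: Netlist, pi_values: Dict[str, int],
             faulty_gate: Optional[str] = None) -> Dict[str, int]:
    """Static logic simulation; optionally flip the output of one gate."""
    values = dict(pi_values)
    for gate, (gate_type, inputs) in netlist.items():
        out = GATE_FUNCS[gate_type]([values[net] for net in inputs])
        if gate == faulty_gate:
            out ^= 1  # inject the transient fault as a logic flip
        values[gate] = out
    return values


def is_logically_masked(netlist: Netlist, pi_values: Dict[str, int],
                        faulty_gate: str, observed: List[str]) -> bool:
    """True if no observed node (output or FF input) differs after injection."""
    good = simulate(netlist, pi_values)
    bad = simulate(netlist, pi_values, faulty_gate)
    return all(good[net] == bad[net] for net in observed)


# Two cascaded NAND2 gates: g2 masks a fault on g1 when input c is 0.
netlist = {"g1": ("NAND", ["a", "b"]), "g2": ("NAND", ["g1", "c"])}
print(is_logically_masked(netlist, {"a": 1, "b": 1, "c": 0}, "g1", ["g2"]))
print(is_logically_masked(netlist, {"a": 1, "b": 1, "c": 1}, "g1", ["g2"]))
```

In this toy circuit the fault on `g1` is masked when `c` is at the NAND controlling value 0, and propagates to `g2` when `c` is 1.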

#### Electrical masking

An accurate analysis of SETs, occurring in the combinational logic of a circuit, entails the existence of a precise propagation model for the glitches. Generally, there are two approaches regarding the modeling of the electrical masking effect, both having advantages and disadvantages. The first utilizes a closed-form expression to approximate the actual pulse propagation, whereas the second is based on SET pulse characterization through SPICE simulations.

As regards the first approach, an analytic expression for the modeling of pulse propagation is simple and fast to handle, though it lacks accuracy, since it is an approximation of the actual pulse propagation. For the purpose of this dissertation, a simple linear function [35], depending on the gate propagation delay, is adopted, as shown in Equation 4.1:

$$w_{out} = \begin{cases} 0, & w_{in} < d \\ 2 \cdot (w_{in} - d), & d \le w_{in} < 2 \cdot d \\ w_{in}, & w_{in} \ge 2 \cdot d \end{cases} \tag{4.1}$$

where $w_{in}$ is the width (duration) of the glitch at the gate input, $w_{out}$ is the width of the glitch at the respective output, and $d$ is the gate propagation delay. Given this equation, we are able to conclude that the attenuation of a glitch will be more intense as it propagates through a slow gate, compared to a fast gate. Therefore, slow gates have better glitch attenuation characteristics. Note that the height of the glitch is considered adequate to change the output logic state.
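As an illustration, Equation 4.1 can be written as a small function and applied along a path of gates. The function names and the delay values below are assumptions made for this sketch only.

```python
from typing import Iterable


def propagate_width_linear(w_in: float, d: float) -> float:
    """Output glitch width according to Equation 4.1 (same time unit as d)."""
    if w_in < d:
        return 0.0               # glitch is fully filtered
    if w_in < 2 * d:
        return 2 * (w_in - d)    # glitch is attenuated
    return w_in                  # glitch passes unchanged


def propagate_along_path(w_in: float, delays: Iterable[float]) -> float:
    """Apply Equation 4.1 gate by gate along one propagation path."""
    width = w_in
    for d in delays:
        width = propagate_width_linear(width, d)
        if width == 0.0:
            break  # electrically masked, nothing left to propagate
    return width


# Hypothetical path with gate delays of 40 ps, 60 ps, and 30 ps.
print(propagate_along_path(100.0, [40.0, 60.0, 30.0]))  # 80.0
```

Here the 100 ps glitch passes the first gate unchanged, is attenuated to 80 ps by the slower 60 ps gate, and passes the last gate unchanged.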

A complete SPICE characterization process, on the other hand, of the logic cells of the library is performed to cope with their different electrical properties and nonlinear behavior of MOSFETs. In particular, a series of simulations for each logic gate is conducted applying a variety of pulses solely to one input, while keeping the other inputs stable, and observing the pulse that appears at the output. The characteristics that differentiate each pulse from the others are the pulse widths and the slews, i.e., the rise and fall times. Moreover, various output capacitances are applied to account for the different fan-outs and interconnection parasitics of each design. Therefore, a number of Look-Up-Tables (LUTs) are formed so as, given a specific pulse at the input of a gate and a capacitance load at the output, it is trivial to identify the pulse at its output by referring to the corresponding LUT entry. If these values do not coincide with the LUT's indices, interpolation is performed to acquire the output pulse width. The parameters that are examined in order to form the LUTs are the pulse width, the rise and fall delays, and the output capacitance of the gate. Generally, the advantage of this approach is that it provides a high accuracy, since the results of an electric-level simulator are recorded on the LUTs and utilized for the pulse propagation during the customized SER estimation. However, there are difficulties regarding its implementation as it is almost impossible to account for all SET pulses in terms of their shape characteristics that may emerge in circuit nodes during their propagation through its different paths. 
In view of the different fanouts and parasitic capacitances of each gate in a circuit, the pre-characterization of the logic gates becomes an expensive process in terms of time due to the vast amount of SPICE simulations that should be conducted for all the logic cells and for all the parameters that are critical in the SET pulse propagation width. At the same time, this characterization process needs to be conducted for each CMOS technology utilized.

An alternative approach that is implemented in this dissertation bypasses the painstaking pre-characterization process and is based on the following inference from recent works. In [78, 79] the authors presented the Propagation Induced Pulse Broadening (PIPB) effect that a SET is subjected to as it propagates through long inverter chains, whereas the authors in [80] showed that a SET pulse may attenuate or broaden depending on the gate delay. Both works indicate that the broadening of the pulse should be considered for SER estimation.

A certain number of SPICE simulations are performed to observe the SET pulse formation as it propagates through a logic cell. In particular, pulses of different widths are applied as inputs to various logic cells. Each pulse is modeled as a trapezoidal waveform, and different output capacitances are applied to account for the different number of fan-outs that a gate can have as well as the different interconnection parasitics of the output. Table 4.1 presents, indicatively, the simulation results of a NAND2 gate when a SET pulse of different widths emerges on one input, while the other input is at a non-controlling value so that the pulse is not logically masked. Both the $0\rightarrow 1\rightarrow 0$ and $1\rightarrow 0\rightarrow 1$ transitions of the SET pulse as well as different output capacitance values are taken into account. In particular, the low-to-high propagation delay ($t_{pLH}$) and high-to-low propagation delay ($t_{pHL}$) of each transition are presented along with the SET pulse widths at the gate output. The $t_{pLH}$ is the time interval from the point that the input reaches 50% of the supply voltage (Vdd) to the point that the output reaches 50% of Vdd. The $t_{pHL}$ is defined similarly, taking into account the opposite transition. The propagation delays and the output pulse width are directly related, since the absolute difference of $t_{pLH}$ and $t_{pHL}$ denotes the deformation of the output pulse. Particularly, for the $0\rightarrow 1\rightarrow 0$ transition the pulse width broadens by the absolute difference, whereas for the $1\rightarrow 0\rightarrow 1$ transition the pulse attenuates by the absolute difference, as shown in Equation 4.2:

$$w_{out} = w_{in} \pm |t_{pLH} - t_{pHL}| \tag{4.2}$$

Note that, as presented in Table 4.1, for high capacitance values the output pulse width is equal to zero when the SET width is 100ps (at 10fF for the $0\rightarrow 1\rightarrow 0$ transition, and at 5fF and 10fF for the $1\rightarrow 0\rightarrow 1$ transition). This is due to the fact that the amplitude of the particular output pulses does not exceed the Vdd/2 transition threshold, which means that it is not sufficient to propagate to the next stage and settle to a faulty voltage level. Also, it is worth mentioning that there is a slight divergence between the measured output pulse width and the width predicted from the difference between the $t_{pHL}$ and $t_{pLH}$ delays, which results from the SPICE simulations.

Table 4.1: Propagation delays and output pulse widths for a NAND2 gate and for both transitions considering 100ps, 300ps, and 500ps input pulse widths and output capacitances of 1fF, 5fF, and 10fF (all values in ps)

| Transition      | $w_{in}$ | $t_{pLH}$ (1fF) | $t_{pHL}$ (1fF) | $w_{out}$ (1fF) | $t_{pLH}$ (5fF) | $t_{pHL}$ (5fF) | $w_{out}$ (5fF) | $t_{pLH}$ (10fF) | $t_{pHL}$ (10fF) | $w_{out}$ (10fF) |
|-----------------|----------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|------------------|------------------|------------------|
| $0{\to}1{\to}0$ | 100      | 26.6            | 13.4            | 114.2           | 76.3            | 45.1            | 132.2           | -                | -                | 0                |
|                 | 300      | 26.4            | 13.4            | 313.9           | 86.6            | 45.1            | 342.5           | 163.8            | 84.3             | 380.5            |
|                 | 500      | 26.3            | 13.4            | 513.9           | 88.3            | 45.1            | 544.1           | 164.4            | 84.3             | 581.1            |
| $1{\to}0{\to}1$ | 100      | 13.5            | 26.7            | 87.8            | -               | -               | 0               | -                | -                | 0                |
|                 | 300      | 13.8            | 26.7            | 288.1           | 43.3            | 86.9            | 257.4           | 51.4             | 164.8            | 187.6            |
|                 | 500      | 13.8            | 26.7            | 488.1           | 44.1            | 86.9            | 458.2           | 78.4             | 164.8            | 414.6            |

In conclusion, we are able to model the electrical masking and determine the SET pulse width as it propagates through a logic gate by taking into consideration solely the rise ($t_{pLH}$) and fall ($t_{pHL}$) gate delays, which are calculated dynamically, depending on the logic input values and considering the transition of the pulse. Unlike the previous approach, the electrical masking is modeled in this manner accurately and with no additional time cost: on the one hand, the time-consuming SPICE simulations are utilized just to verify the correctness of this approach and, on the other hand, the computation of the corresponding delays is made once, during the timing analysis, as discussed in the next sections.
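A minimal sketch of Equation 4.2, checked against two entries of Table 4.1, is given below. The function name and the string labels for the transitions are hypothetical, and the sketch follows the sign convention stated for Equation 4.2. It does not model the amplitude threshold that yields the zero-width entries of the table.

```python
def propagate_width_delays(w_in: float, t_plh: float, t_phl: float,
                           transition: str) -> float:
    """Output pulse width from Equation 4.2 (all times in ps)."""
    delta = abs(t_plh - t_phl)
    if transition == "0->1->0":
        w_out = w_in + delta   # pulse broadens
    elif transition == "1->0->1":
        w_out = w_in - delta   # pulse attenuates
    else:
        raise ValueError("transition must be '0->1->0' or '1->0->1'")
    return max(w_out, 0.0)     # a width cannot be negative


# Table 4.1, 300 ps input pulse, 5 fF load:
# SPICE reports 342.5 ps and 257.4 ps for the two transitions.
print(propagate_width_delays(300.0, 86.6, 45.1, "0->1->0"))  # about 341.5
print(propagate_width_delays(300.0, 43.3, 86.9, "1->0->1"))  # about 256.4
```

Both estimates differ from the SPICE measurements by about 1 ps, which is in line with the slight divergence noted in the text.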

#### Timing masking

Timing masking is the last factor that is examined in the simulation, since it is necessary to know, first, whether the transient fault was logically masked and, second, the pulse width at the FF inputs. Given the setup and hold times, the pulse width, the moment that the particle strike occurs, the moment that the pulse arrives at the FF input, and the clock period, we are able to inspect the timing masking occurrence by checking if the pulse arrives outside the latching window. There are two conditions, either of which renders a SET arriving at a FF masked by timing, as shown in Equation 4.3:

$$\begin{cases} t_{hit} + d_{prop} > T - t_{setup} & \text{(a)} \\ t_{hit} + d_{prop} + w < T + t_{hold} & \text{(b)} \end{cases} \tag{4.3}$$

where w is the width of the transient pulse,  $t_{hit}$  is the time moment that the SET emerges,  $d_{prop}$  is the propagation delay of the SET, T is the clock period, and  $t_{setup}$  and  $t_{hold}$  are the setup and hold times of the FF, respectively. The first condition 4.3(a) checks if the transient pulse violates the setup time rule, that is, the pulse should be stable a minimum amount of time (setup time) before the clock's positive edge. The second condition 4.3(b) checks if the transient pulse violates the hold time rule, that is, the pulse should be stable a minimum amount of time (hold time) after the clock's positive edge. If at least one of these conditions is true, i.e., the latching conditions are not met, the transient fault is masked. At the same time, the amplitude of the pulse should exceed the threshold of Vdd/2 to be latched. If the pulse does not exceed this threshold it is not eventually latched either.
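The two conditions of Equation 4.3 translate directly into a short check. The following is a minimal sketch; the function name and the numeric values are assumptions, and the amplitude check against Vdd/2 is left out.

```python
def is_timing_masked(t_hit: float, d_prop: float, w: float,
                     T: float, t_setup: float, t_hold: float) -> bool:
    """True if the SET is masked by timing according to Equation 4.3."""
    arrival = t_hit + d_prop
    cond_a = arrival > T - t_setup        # 4.3(a): pulse starts too late
    cond_b = arrival + w < T + t_hold     # 4.3(b): pulse ends too early
    return cond_a or cond_b


# Hypothetical FF with T = 1000 ps, setup = 30 ps, hold = 20 ps.
# Pulse occupying [800, 900] ps ends before the latching window closes.
print(is_timing_masked(500.0, 300.0, 100.0, 1000.0, 30.0, 20.0))  # True
# Pulse occupying [900, 1100] ps covers the whole window [970, 1020] ps.
print(is_timing_masked(600.0, 300.0, 200.0, 1000.0, 30.0, 20.0))  # False
```

A pulse is latched only when it covers the entire latching window, so a pulse narrower than the sum of setup and hold time is always masked by this check.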

### 4.2.2 Reconvergent transient pulses

A significant factor that affects fault propagation is the handling of reconvergent pulses. This tool takes into account SETs following multiple paths that may reconverge at a subsequent gate. Thus, when two or more pulses of the same SET reconverge at a cell having the same direction (Figure 4-2(a)), the output pulse is approximately equal (due to the different rise and fall times) to the overlapping period. On the other hand, for overlapping pulses with opposite directions (Figure 4-2(b)), the resulting pulse at the gate output depends on the gate type and its controlling value. For example, in the simulated case of a NOR2 gate, whose controlling value is logic 1, the output pulse equals the period between the moment that the first pulse falls below Vdd/2 and the moment that the second one rises above Vdd/2. For the non-overlapping case, as the SPICE simulation shows (Figure 4-2(c)), both pulses emerge at the output. However, in order to model this case in the proposed framework, for the sake of simplicity, only the greater pulse is taken into consideration.

Figure 4-2: Output pulses for (a) same direction, (b) different direction, and (c) non-overlapping input reconvergent pulses.
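The same-direction and non-overlapping rules described above can be sketched as follows. The representation of a pulse as a pair of Vdd/2 crossing times and the function name are assumptions; the opposite-direction case, which depends on the gate type, is not covered, and the small deviation caused by the different rise and fall times is ignored.

```python
from typing import Tuple

# Hypothetical pulse representation: (start, end) times in ps,
# measured at the Vdd/2 crossings.
Pulse = Tuple[float, float]


def reconverged_width(p1: Pulse, p2: Pulse) -> float:
    """Approximate output width for two same-direction reconvergent pulses."""
    overlap = min(p1[1], p2[1]) - max(p1[0], p2[0])
    if overlap > 0:
        return overlap  # overlapping case: keep the overlapping period
    # Non-overlapping case: keep only the wider of the two pulses.
    return max(p1[1] - p1[0], p2[1] - p2[0])


print(reconverged_width((100.0, 300.0), (200.0, 350.0)))  # 100.0
print(reconverged_width((100.0, 200.0), (300.0, 450.0)))  # 150.0
```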

### 4.2.3 Timing issues

#### Gate delay

A SER estimation process requires a basic timing analysis of the circuit. On the one hand, the propagation delay of the generated SET up to the memory elements and outputs should be calculated for the modeling of electrical and timing masking. On the other hand, it is necessary to obtain the circuit critical path so that we are able to determine the minimum clock period of the circuit. Therefore, the accuracy of the timing analysis that is adopted for the timing characteristics of the circuit contributes to the accuracy of the SER estimation. In this dissertation, two approaches are taken into account to calculate gate delay: the plain and straightforward method of logical effort [81] and the more accurate and complicated method utilizing a Non-Linear Delay Model (NLDM).

The logical effort is a simple technique that calculates the gate delay in terms of the RC constant  $(\tau)$ , that is the delay of the ideal inverter. The gate delay is expressed as the sum of the gate's parasitic delay (p) and stage effort (f), whereas the stage effort is the product of the logical effort (g) and the electrical effort (h). The ideal Inverter is considered the reference gate to obtain all these individual parameters for the rest of the gates, rendering this method quite simple as it is adequate to know the sizes of the gates relative to the Inverter size. Except for its simplicity, this method is considered technology independent, since the only parameter that changes across technologies is the time constant  $\tau$ , which can be easily obtained through electrical simulations.
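As a brief illustration of the logical effort formulation, the gate delay is $d = \tau \cdot (g \cdot h + p)$. The sketch below uses the commonly cited textbook values for an inverter ($g = 1$, $p = 1$) and a NAND2 gate ($g = 4/3$, $p = 2$); the function name and the choice of an electrical effort of 4 are assumptions for this example.

```python
def logical_effort_delay(g: float, h: float, p: float, tau: float = 1.0) -> float:
    """Gate delay d = tau * (g * h + p); with tau = 1 the result is in units of tau."""
    stage_effort = g * h  # f = logical effort * electrical effort
    return tau * (stage_effort + p)


# Inverter driving four identical inverters (h = 4): 5 tau.
print(logical_effort_delay(g=1.0, h=4.0, p=1.0))
# NAND2 with the same electrical effort: 4/3 * 4 + 2 = 22/3 tau.
print(logical_effort_delay(g=4.0 / 3.0, h=4.0, p=2.0))
```

Passing a technology-specific value of `tau` converts the result into absolute time.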

However, this technique neglects the different input signal slopes (edge rates) as well as the wide range of the output capacitive loads, thus resulting in inaccurate estimation of gate's delay. To incorporate a more accurate timing model into the SER estimation tool that corresponds to the results of the state-of-the-art timing analysis EDA tools, the gate delay is calculated taking into account the NLDM as this is described in the CMOS technology library. This model consists of a set of pre-characterized Look-Up-Tables for each logic cell, which store the rise and fall propagation delays based on indicative values of the input transition time and the output load capacitance. If a specific input slope or output capacitance of a gate does not match with the indices of the LUT, interpolation is utilized to obtain the corresponding delay value. Also, note that each library includes such tables for different corners, such as for typical, fast, slow or worst case conditions.
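The table lookup with interpolation can be sketched with a bilinear scheme, as shown below. The table values, axis breakpoints, and function names are hypothetical and do not come from any real technology library; bilinear interpolation is one common choice and is an assumption here.

```python
from bisect import bisect_right
from typing import List, Sequence


def _bracket(axis: Sequence[float], x: float) -> int:
    """Index i of the segment [axis[i], axis[i+1]] used for x (clamped at the ends)."""
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lut_delay(slews: Sequence[float], loads: Sequence[float],
              table: List[List[float]], slew: float, load: float) -> float:
    """Bilinear interpolation in a delay table indexed by [slew][load]."""
    i = _bracket(slews, slew)
    j = _bracket(loads, load)
    ts = (slew - slews[i]) / (slews[i + 1] - slews[i])
    tl = (load - loads[j]) / (loads[j + 1] - loads[j])
    low = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    high = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return low * (1 - ts) + high * ts


# Hypothetical 2x2 rise-delay table (ps) over input slew (ps) and load (fF).
slews = [10.0, 50.0]
loads = [1.0, 5.0]
table = [[20.0, 40.0],
         [30.0, 50.0]]
print(lut_delay(slews, loads, table, slew=30.0, load=3.0))  # 35.0
```

Values outside the table range are linearly extrapolated from the nearest segment in this sketch.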

#### Interconnection delay and propagation

Another critical issue regarding the performance of modern CMOS circuits is the interconnect wiring between the components (e.g., logic cells, logic blocks). The interconnects introduce parasitic resistance (R), inductance (L), and capacitance (C), which may affect the propagation delay. Therefore, various approximate techniques exist to model and estimate the interconnection delay during the pre-layout phase, taking into account the number of fan-outs of a net and estimating its total wire length. However, the actual interconnection network of a circuit can be obtained after the Placement and Routing (P&R) process with the extraction of its Standard Parasitic Exchange Format (SPEF) file. Such a file describes the parasitics of the interconnections and may be further used for simulation purposes such as timing analysis.



Figure 4-3: Distributed RC interconnection tree.

Figure 4-3 shows an example of an actual RC network of a net, as described in the SPEF file of a design. This is a distributed net model and is depicted as an RC-tree with 2 branches. Given an RC network like the one of Figure 4-3, it is trivial to compute the interconnection delay according to Elmore's delay model [82], which is, however, beyond the purpose of the current work.
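For reference, the Elmore delay of an RC tree sums, for every resistor on the path from the driver to a sink, the product of that resistance and the total capacitance downstream of it. The following minimal sketch uses a hypothetical two-branch tree with made-up values; the node names and the dictionary-based representation are assumptions and do not reflect the SPEF parser of the tool.

```python
from typing import Dict, List, Optional


def elmore_delays(parent: Dict[str, Optional[str]],
                  res: Dict[str, float],
                  cap: Dict[str, float]) -> Dict[str, float]:
    """Elmore delay from the root (driver) to every node of an RC tree.

    parent[n] is the upstream node of n (None for the root), res[n] is the
    resistance of the segment parent[n] -> n, and cap[n] is the capacitance at n.
    """
    children: Dict[str, List[str]] = {n: [] for n in parent}
    for node, up in parent.items():
        if up is not None:
            children[up].append(node)

    downstream: Dict[str, float] = {}

    def total_cap(node: str) -> float:
        # Capacitance at the node plus everything below it (recursive).
        downstream[node] = cap[node] + sum(total_cap(c) for c in children[node])
        return downstream[node]

    root = next(n for n, up in parent.items() if up is None)
    total_cap(root)

    delay = {root: 0.0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            delay[child] = delay[node] + res[child] * downstream[child]
            stack.append(child)
    return delay


# Hypothetical tree: driver -> n1, then two branches n1 -> n2 and n1 -> n3.
parent = {"drv": None, "n1": "drv", "n2": "n1", "n3": "n1"}
res = {"n1": 100.0, "n2": 200.0, "n3": 300.0}          # ohm
cap = {"drv": 0.0, "n1": 1.0, "n2": 2.0, "n3": 3.0}    # fF
print(elmore_delays(parent, res, cap))  # delays in ohm*fF, i.e. femtoseconds
```

With these values the delays to the sinks `n2` and `n3` are 1000 fs and 1500 fs, respectively.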

To accurately estimate SER, it is crucial to take into account the effect of interconnection parasitics on pulse propagation. Therefore, to incorporate the interconnection network into the SER estimation tool, a SPEF file parser was implemented to account for the parasitics of each net and thus estimate its delay. Moreover, for each net, the pulse width at the output of a gate is transformed to a new one at the inputs of the fanout gates, taking into account the current parameters, such as slew and total wire capacitance. Thus, a detailed model of the interconnection network is accounted for in the SER estimation.

### 4.2.4 Failure probability calculation

As mentioned, this Monte Carlo-based SER estimation tool conducts repeated simulations taking into account different parameters each time. These parameters include the gate on which a fault is injected, the primary input logic vector for the logic simulation, the width of the generated SET, and the strike moment within the clock cycle. The variety of the parameters and the significant number of simulations for the different values of these parameters provide sufficient data for an accurate SER estimation. A brief description of the basic simulation procedure to estimate the SER of a circuit is presented in Algorithm 1.

**Algorithm 1** Basic failure probability calculation algorithm considering SETs

Input: Circuit netlist file

Output: Circuit failure probability

- 1: Read the circuit netlist
- 2: **for** each logic gate **do**
- 3: **for** random primary input vectors **do**
- 4: Do a single logic simulation and record the logic values at FF inputs
- 5: **for** SET pulses of random shapes and at random time moments within the period injected on the gate **do**
- 6: Do a single logic and electrical masking simulation accounting for the SET and record its shape at FF inputs
- 7: Timing masking check by identifying if SET is latched at FFs
- 8: **end for**
- 9: **end for**
- 10: Calculate the total latching probabilities over the total number of simulations
- 11: **end for**
- 12: Calculate the circuit failure probability combining the weighted total latching probabilities of each gate

The execution starts by parsing the input file, which is a simple netlist of the circuit, and storing the details and the structure of the design in appropriate data structures. Then, a cycle of simulations is performed for each logic gate of the circuit, excluding the sequential cells. Each cycle includes, firstly, the assignment of random logic values to the primary inputs and FF outputs of the design and the logic simulation considering these values to record the logic values at the FF data inputs (line 4). Subsequently, we inject SETs of random width and striking time moment on the particular gate and another simulation starts. This time, the faulty logic value, the SET width, and the striking time are accounted for to perform the logical, electrical, and timing masking operations (lines 5-7). After completion of the last simulation, we are aware of the faulty and non-faulty logic values, the SET widths, and the SET arrival times at the FF inputs. In view of these data that are obtained from a single simulation, we are able to calculate the latching probability at the FFs, that is, the probability that a SET will be latched by a FF. In particular, a SET is latched by a FF when none of the masking mechanisms occurs. Therefore, for all the FFs we check if the FF inputs are at an erroneous logic state (logical masking). Then we check which of the transient pulses at these inputs are wide enough to actually affect the FF input (electrical masking). Finally, we examine if neither of the conditions 4.3(a) and 4.3(b) is satisfied, which means that the SET is latched (timing masking). If there is at least one FF that satisfies these checks, a soft error emerges and the latching probability of the current simulation is one. Otherwise, it is equal to zero, as shown below:

$$latched\_glitch = \begin{cases} 0, & \text{when } (4.3) \text{ is true} \\ 1, & \text{when } (4.3) \text{ is false} \end{cases} \tag{4.4}$$

Afterwards, a new simulation taking into account different parameters is made providing another latching probability. The total latching probability is the sum of the individual latching probabilities for all the simulations over the total number of simulations n as shown (line 10):

$$P_{latched}^{total} = \frac{1}{n} \sum_{i=1}^{n} latched\_glitch(i) \tag{4.5}$$

Note that this calculation is made for each logic gate of the circuit. Therefore, we need to combine the total latching probabilities  $P_{latched}^{total}$  of each gate to calculate the circuit failure probability (line 12). However, the probability that a particle hits a gate depends on the area that this gate occupies in the circuit, and thus a weighted equation incorporates this parameter given the number of gates g, as follows:

$$P_{fail} = \sum_{j=1}^{g} \frac{A_j}{A_{circuit}} P_{latched}^{total}(j) \tag{4.6}$$

where  $A_j$  is the area of gate j,  $A_{circuit}$  is the circuit area, and  $P_{latched}^{total}(j)$  is the total latching probability of gate j.
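As a rough illustration of Equations (4.5) and (4.6), the following Python sketch averages the per-simulation latching outcomes of each gate and then combines them with area weights. The names `total_latching_probability`, `failure_probability`, and `make_stub`, the gate areas, and the stub probabilities are hypothetical; the stub merely stands in for one full simulation of Algorithm 1 (lines 4-7), and taking $A_{circuit}$ as the sum of the listed gate areas by default is an assumption of this sketch.

```python
import random

def total_latching_probability(simulate_once, n, rng):
    """Equation (4.5): mean of latched_glitch over n simulations of one gate."""
    return sum(simulate_once(rng) for _ in range(n)) / n

def failure_probability(gate_areas, gate_latching, circuit_area=None):
    """Equation (4.6): area-weighted sum of the per-gate latching probabilities."""
    if circuit_area is None:
        # Assumption: the circuit area is the sum of the listed gate areas.
        circuit_area = sum(gate_areas.values())
    return sum(gate_areas[g] / circuit_area * gate_latching[g] for g in gate_areas)

def make_stub(p):
    """Hypothetical stand-in for one simulation: returns 1 with probability p."""
    return lambda rng: 1 if rng.random() < p else 0

rng = random.Random(0)
areas = {"G1": 0.5, "G2": 1.0, "G3": 1.5}      # hypothetical gate areas
stub_p = {"G1": 0.10, "G2": 0.20, "G3": 0.05}  # hypothetical latching chances
latching = {g: total_latching_probability(make_stub(p), 2000, rng)
            for g, p in stub_p.items()}
p_fail = failure_probability(areas, latching)
print(f"P_fail = {p_fail:.4f}")
```

In the actual tool, the stub would be replaced by the logic, electrical, and timing masking simulation described above.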

### 4.2.5 Gate sensitivity evaluation

This subsection presents a methodology to identify the sensitivity of the gates to radiation-induced faults. The motivation behind this analysis is that the knowledge of which gates are more sensitive to soft errors is necessary in the effort to reduce their effects on ICs. However, reducing the SER of a circuit through various hardening approaches comes with additional cost in terms of area, delay, and power consumption. In order to confine this overhead, it is common to harden the most vulnerable areas of the circuit instead of its entirety. The sensitivity of a logic gate corresponds to its relative contribution to the overall circuit SER and is obtained through several targeted simulations.

Intuitively, in combinational logic, a gate is considered sensitive when the probability that a SET generated at its output propagates to a memory element and is latched is not negligible. In such a case, the three masking effects that could mitigate the SET are weak. Therefore, the gate sensitivity metric is inversely related to the joint masking capability of the three effects. The *Glitch Latching Probability* (GLP) of each gate of a circuit is defined as the probability that a transient glitch at the gate output will propagate and be eventually latched by at least one memory element. A simplified variation of the aforementioned SER estimation methodology is followed to characterize the gate sensitivity. In particular, transient pulses of different widths are injected on each gate. Also, each one of the pulses is applied at different time moments during the clock period. Subsequently, a sufficient number of simulations are performed using different primary input vectors. Performing these simulations under different parameters, we ensure that the masking effects are sufficiently exercised. During the simulation, the generated pulse is subjected to these effects as it propagates through the circuit. The fraction of these faults that are captured by at least one sequential element is then obtained, assigning a sensitivity value to each gate, which is computed as follows:

$$GLP = \frac{1}{n} \sum_{i=1}^{n} latched\_glitch(i) \tag{4.7}$$

where n is the total number of simulations, equal to the product  $n = l \cdot e \cdot t$ , where l is the number of different primary input vectors used for the simulation of logical masking, e is the number of different pulse widths that are used, t is the number of different fixed time instants within the clock period at which errors occur, and  $latched\_glitch$  equals one when a fault is latched by at least one memory element and zero otherwise.

The large number of simulations, due to the different parameters used, as well as the complexity of the large-scale benchmarks, renders this process time-consuming, yet it provides a quite accurate assessment of the relative sensitivity among the gates of a given design.
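The enumeration behind Equation (4.7) can be sketched in Python as a full sweep over the $l \cdot e \cdot t$ parameter combinations of one gate. The function name `glitch_latching_probability`, the predicate `toy_is_latched`, and all numeric values below are hypothetical; the predicate is only a toy replacement for the masking simulation, in which a fault counts as latched when it is not logically masked (first input bit is 1), is wide enough (at least 50 ps), and arrives inside an assumed latching window.

```python
from itertools import product

def glitch_latching_probability(vectors, widths, times, is_latched):
    """Equation (4.7): fraction of the n = l*e*t simulations in which the
    glitch is latched by at least one memory element."""
    n = len(vectors) * len(widths) * len(times)
    latched = sum(1 for v, w, t in product(vectors, widths, times)
                  if is_latched(v, w, t))
    return latched / n

def toy_is_latched(vector, width_ps, time_ps):
    """Hypothetical stand-in for one masking simulation of a single gate."""
    return vector[0] == 1 and width_ps >= 50 and 200 <= time_ps <= 300

vectors = [(0, 0), (0, 1), (1, 0), (1, 1)]  # l = 4 primary input vectors
widths = [20, 60, 100]                      # e = 3 pulse widths (ps)
times = [0, 100, 200, 300, 400]             # t = 5 strike instants (ps)
glp = glitch_latching_probability(vectors, widths, times, toy_is_latched)
print(f"GLP = {glp:.4f}")
```

Computing this value for every gate and sorting the gates by it would give the relative sensitivity ranking discussed above.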

## 4.3 Single Event Multiple Transients

As technology nodes keep shrinking, following Moore's Law, a significant problem emerges and raises concerns about the susceptibility of ICs to radiation-induced hazards. In particular, a radiation particle that strikes a point of a circuit might provoke multiple transient faults on neighboring logic cells as a result of the reduction in silicon device feature sizes and the higher integration densities of modern ICs, as shown in Figure 4-4. Such incidents call for a more detailed and comprehensive study of their behavior. Thus, it is paramount that the SER estimation procedure be reconsidered to account for the emerging threat of SEMTs. This section focuses on the critical parameters that render a SER estimation tool accurate and concludes with the overall SER estimation algorithm.

Figure 4-4: Node density and SEMTs increase as technology downscales.

### 4.3.1 Sensitive regions

A critical aspect of a layout-based SER estimation is the determination of the sensitive regions of a gate. Suppose that a radiation particle strikes the silicon of an IC and, specifically, a transistor (either an nMOS or a pMOS) of a logic cell. As mentioned, this strike may provoke a SET at the gate output, provided that its energy is sufficient to generate a charge that exceeds the critical charge of the gate (charge collection mechanism). However, an additional condition must be met for the gate output logic value to flip: the particle must strike a sensitive region of the gate.

In general, the sensitive regions of a logic cell are considered to be the inactive transistors and, more specifically, the drain of the inactive pMOS transistors and the channel region of the inactive nMOS transistors. On the other hand, the source regions of the transistors are biased either to the supply voltage or to the ground, and thus they do not take part in the charge collection mechanism. Which transistors are active or inactive is determined by the inputs at the gate of the transistors, and thus by the current logic gate inputs. This means that, for a CMOS circuit, the sensitive regions of a particular gate change dynamically as the circuit operates. An example of sensitive regions is illustrated in Figure 4-5, which presents an Inverter and a NAND2 gate, their schematics, and their physical layouts indicating the sensitive regions, which are contingent on the input logic values. For the Inverter, the sensitive regions are the drain of the pMOS (SR1) and the drain of the nMOS transistor (SR2). When an ionizing particle strikes the drain of the pMOS of the Inverter, a SET is generated only if the transistor is inactive, that is, the input vector is (A) = (0). Thus, if the input vector is (A) = (1), no SET emerges at the output. Regarding the NAND2 gate, there are three sensitive regions: the pMOS drain (SR1), the upper nMOS drain (SR3), and the source/drain junction of the two nMOS (SR2), as Figure 4-5(b) shows. Accordingly, when a particle strikes the pMOS drain, a SET is induced only when both transistors are inactive, hence for the (A1, A2) = (0, 0) input vector. Note that the sensitive regions of any logic cell, even of the complex AOI and OAI cells, can be identified through device-level or circuit-level simulations.



Figure 4-5: Illustration of the sensitive regions upon the physical layout and their dependency on the logic input values of two basic logic cells, (a) Inverter and (b) NAND2.

### 4.3.2 Multiple site identification

One of the most important steps in a SEMT analysis is to identify the gates that are affected by a single particle strike, in other words, the gates neighboring the point where the particle struck. To extract the physically adjacent gates, previous approaches examined the gate fan-outs and fan-ins in the logic-level netlist [52, 53]. However, this assumption may result in overestimation or underestimation of the SER, since the actual neighboring gates may differ.

To corroborate this inference, consider the small circuit segment illustrated in Figure 4-6. There are two representations of this segment. The first, in Figure 4-6(a), is a schematic that describes the connections among the logic gates and the registers, whereas the second, in Figure 4-6(b), demonstrates the placement of the logic standard cells in the rows, which are created during the floorplanning stage. Suppose that a high-energy particle strikes the NOT1 gate. According to the fan-ins of the gate, the NOR2 gate should be regarded as an adjacent gate since its output is the input of NOT1. However, the topological information obtained from the DEF file shows that the NOR1 gate is not adjacent to NOT1. On the contrary, the NAND2 gate is adjacent, and thus a particle striking the NOT1 gate could potentially affect it as well, as Figure 4-6(b) shows. Similarly, a particle strike on gate NOT2 could affect the NOR2 gate, since they are adjacent. Therefore, considering only logic-level netlists during the analysis leads to an inaccurate estimation of the SER, since the fan-outs and fan-ins of the struck gate may be marked as neighboring gates even though, according to the actual layout, they do not belong to the same error site.

Figure 4-6: Juxtaposition of (a) netlist-based and (b) layout-based identification of adjacent gates revealing the inaccuracy of the former approach.

Therefore, it is crucial to take into account the actual placement of the logic cells on the chip's core area. Each circuit physical layout is completely described by the Design Exchange Format (DEF) file, along with the Library Exchange Format (LEF) file, in an ASCII format [83]. For the purpose of this dissertation, only the DEF file is utilized in order to keep information about the circuit connectivity and the placement of its components. A typical part of a DEF file is presented in Figure 4-7. In particular, the COMPONENTS block includes all the logic cells of the design and their coordinates and orientation on the circuit die, whereas the NETS block describes the connections of the nets to the logic cells as well as their routing.

```
# (a) COMPONENTS block
COMPONENTS 205 ;
- NOT_28 INV_X4 + PLACED ( 3800 23000 ) N ;
- NOT_27 INV_X2 + PLACED ( 29640 3400 ) FS ;
- NOT_26 INV_X2 + PLACED ( 29640 6200 ) N ;
- NOT_25 INV_X1 + PLACED ( 5700 23000 ) N ;
- DFF_13 DFF_X1 + PLACED ( 11780 25800 ) FS ;
- DFF_12 DFF_X2 + PLACED ( 8360 600 ) N ;
- DFF_11 DFF_X1 + PLACED ( 0 20200 ) FS ;
- DFF_10 DFF_X1 + PLACED ( 5320 25800 ) FS ;
- NOR3_4 NOR3_X2 + PLACED ( 11020 20200 ) FS ;
- NOR3_3 NOR3_X1 + PLACED ( 16340 600 ) N ;

# (b) NETS block
NETS 140 ;
- G21
  ( NOT_42 A )
  ( NOT_25 A )
  ( DFF_11 Q )
  + ROUTED metal1 ( 190 24260 ) via1_5
  via2_1 W
  via1_5
  NEW metal1 ( 6270 22300 ) via1_5
  ( 5890 * ) ( * 24260 )
  + USE SIGNAL
```

Figure 4-7: Snippet of (a) COMPONENTS and (b) NETS block statements of a DEF file.
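A minimal Python sketch of how the placement information of the COMPONENTS block could be extracted is given below. It is not the parser of this dissertation: the regular expression, the function name `parse_components`, and the sample text are illustrative assumptions, and the sketch only handles single-line statements of the form shown in Figure 4-7(a), with instance and cell names written as single tokens. The coordinates are kept in DEF database units.

```python
import re

# One COMPONENTS entry: "- <instance> <cell> + PLACED ( x y ) <orient> ;"
COMPONENT_RE = re.compile(
    r"^-\s+(?P<name>\S+)\s+(?P<cell>\S+)\s+\+\s+(?:PLACED|FIXED)\s+"
    r"\(\s*(?P<x>-?\d+)\s+(?P<y>-?\d+)\s*\)\s+"
    r"(?P<orient>FN|FS|FW|FE|N|S|W|E)\s*;"
)

def parse_components(def_text):
    """Return {instance: (cell, x, y, orientation)} for single-line entries."""
    components = {}
    in_block = False
    for raw in def_text.splitlines():
        line = raw.strip()
        if line.startswith("COMPONENTS"):
            in_block = True          # block header, e.g. "COMPONENTS 205 ;"
            continue
        if line.startswith("END COMPONENTS"):
            break
        if in_block:
            m = COMPONENT_RE.match(line)
            if m:
                components[m["name"]] = (
                    m["cell"], int(m["x"]), int(m["y"]), m["orient"])
    return components

# Hypothetical sample in the style of Figure 4-7(a).
SAMPLE = """
COMPONENTS 3 ;
- NOT_28 INV_X4 + PLACED ( 3800 23000 ) N ;
- NOT_27 INV_X2 + PLACED ( 29640 3400 ) FS ;
- DFF_11 DFF_X1 + PLACED ( 0 20200 ) FS ;
END COMPONENTS
"""
comps = parse_components(SAMPLE)
print(comps["NOT_27"])
```

A production parser would also need to handle statements that span several lines and the optional attributes that DEF allows on each component.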

Given the circuit physical layout, the next step is the modeling of the SEMT identification. A particle striking a point on the silicon disturbs an area around it, whose extent depends mostly on the energy that the particle delivers and the angle of the strike. Various works have studied the cross-section of radiation particles striking the semiconductor. This dissertation adopts the results of the works presented in [84, 56], which specify the average area that particles of different energies affect, as Table 4.2 shows. The affected area, which may be depicted as an oval, increases non-linearly with the energy: the higher the energy, the wider the area of the circuit that is affected by the strike. Note that these values may be applied to different technologies, since they depend on the properties of the particle itself.

Table 4.2: Average affected area

| Particle Energy (MeV) | Average Affected Area ( $\mu m^2$ ) |
|-----------------------|-------------------------------------|
| 22                    | 1.178                               |
| 47                    | 1.902                               |
| 95                    | 2.903                               |
| 144                   | 4.613                               |
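For illustration only, the areas of Table 4.2 can be turned into an equivalent strike radius if the affected region is approximated by a circle, so that $r = \sqrt{A/\pi}$. This circular approximation is an assumption of the sketch (the affected area is described here as oval), and the names `AFFECTED_AREA_UM2` and `equivalent_radius_um` are hypothetical.

```python
import math

# Table 4.2: particle energy (MeV) -> average affected area (um^2)
AFFECTED_AREA_UM2 = {22: 1.178, 47: 1.902, 95: 2.903, 144: 4.613}

def equivalent_radius_um(energy_mev):
    """Radius of a circle whose area equals the tabulated affected area.

    Assumption: the oval affected region is approximated by a circle.
    """
    return math.sqrt(AFFECTED_AREA_UM2[energy_mev] / math.pi)

for energy in sorted(AFFECTED_AREA_UM2):
    print(f"{energy:>3} MeV -> r = {equivalent_radius_um(energy):.3f} um")
```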

As mentioned in the previous subsection, the sensitive regions of a logic cell are considered to be the inactive transistors and, particularly, the inactive transistor diffusions that are not biased to Vdd or Gnd. Therefore, it is necessary to account for the position of the transistors on the chip's die. The GDSII is a binary file which represents layout geometrical data, such as transistor diffusion, poly, contacts, metals, vias, etc., in a hierarchical format. By parsing the GDSII files of the logic cells provided by the standard cell library, along with the DEF file of a circuit, we are able to obtain the exact position of each one of the transistors of the utilized logic cells on the die area. In particular, by parsing the GDSII file of each cell we obtain the coordinates of the diffusions, taking the bottom-left corner of the cell as the reference point. Next, we need to map these coordinates to the actual coordinates on the design by utilizing the DEF file. A key point in this mapping is that the standard cells are not all placed in the rows of the floorplan with the same orientation. On the contrary, during the placement process the cells are placed with the orientation that optimizes the performance of the design. The DEF file specifies the orientation of each cell, which can be one of the following: N (North), S (South), W (West), E (East), FN (Flipped-N), FS (Flipped-S), FW (Flipped-W), or FE (Flipped-East). It also specifies the coordinates of the bottom-left corner of the cell on the die area, as shown in Figure 4-7(a). Thus, by combining this information we are able to identify the position of the diffusions on the circuit die. Figure 4-8 presents an N-oriented standard cell and the diffusion areas as obtained from the GDSII file. Two different orientations (W and FS) of the same cell are also illustrated, indicating how a specific point (x, y) of the cell is mapped to the other orientations. Then, given the coordinates of the cells on the die, it is straightforward to map the diffusion coordinates to the actual layout.



Figure 4-8: Mapping of the diffusion areas depending on the standard cell orientation.
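The orientation mapping can be sketched in Python as follows. The table assumes the usual reading of the DEF orientation codes (N as no rotation, S as a 180-degree rotation, W and E as 90- and 270-degree counter-clockwise rotations, FN and FS as mirrors about the vertical and horizontal axes, FW and FE as the corresponding mirrored rotations); this reading should be checked against the LEF/DEF reference and Figure 4-8 before use. The function names `map_point` and `to_die_coordinates` and the example numbers are hypothetical.

```python
def map_point(x, y, w, h, orient):
    """Map a point (x, y) of an N-oriented cell of width w and height h to
    its offset from the bottom-left corner of the placed cell."""
    mapping = {
        "N":  (x, y),
        "S":  (w - x, h - y),
        "W":  (h - y, x),       # placed bounding box becomes h wide, w tall
        "E":  (y, w - x),
        "FN": (w - x, y),
        "FS": (x, h - y),
        "FW": (y, x),
        "FE": (h - y, w - x),
    }
    return mapping[orient]

def to_die_coordinates(x, y, w, h, orient, origin_x, origin_y):
    """Add the DEF placement origin (bottom-left corner of the placed cell)."""
    dx, dy = map_point(x, y, w, h, orient)
    return origin_x + dx, origin_y + dy

# Hypothetical cell of 10 x 4 units with a diffusion corner at (1, 1),
# placed at (100, 200) with orientation FS.
print(to_die_coordinates(1, 1, 10, 4, "FS", 100, 200))  # (101, 203)
```

Each diffusion rectangle would be mapped by transforming two opposite corners and re-sorting the resulting coordinates.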

Note that for the sake of simplicity we consider that the sensitive regions are the n and p diffusions of the transistors, including both the drain and source regions regardless of whether they are biased or not.

For an accurate SEMT identification, two steps should be followed. The first is to identify the sensitive transistors of each logic gate, i.e., the nMOS or pMOS transistor diffusions. This can be done simply by observing, during the logic simulation, which transistors are inactive based on the current logic values. The second is to identify which of the inactive transistors are located within the influence range of the particle strike, as determined by the oval shape. Multiple transient glitches then emerge at the outputs of the affected gates, initiating the simulation for SER estimation. It should be noted that this is a dynamic process, as whether the strike lands on a pMOS or an nMOS determines the SET direction (i.e.,  $0\rightarrow 1\rightarrow 0$  or  $1\rightarrow 0\rightarrow 1$ ).

## 4.4 Overall Soft Error Rate Estimation Flow

In this section, we describe the integrated framework for the SER estimation in the combinational logic of ICs, taking into consideration radiation-induced SEMTs, through a comprehensive algorithm and a detailed flow chart.

Even though the SER estimation process is quite similar for both SET and SEMT analysis, there are some variations that should be highlighted. An apparent difference is the SEMTs consideration, which means that more than one transient pulse propagates at the same time through the circuit. Thus, first, we need to identify the SEMTs, which occur due to a single ionizing particle strike. Algorithm 2 presents the procedure for the fault injection and the SEMT identification and generation that precedes the propagation phase.

**Algorithm 2** Fault injection and identification/generation of SEMTs

```
 1: function ERROR GENERATION
 2:     SEMT_list ← initially empty list
 3:     x_hit ← random point on x-axis of the circuit die where particle strikes
 4:     y_hit ← random point on y-axis of the circuit die where particle strikes
 5:     E ← random particle energy
 6:     r ← Compute the radius of the particle's cross-section according to E
 7:     for all logic gates G_i of the design do
 8:         if distance between nmos/pmos diffusion of G_i and (x_hit, y_hit) < r then
 9:             if affected nmos/pmos is a sensitive region then
10:                 SEMT_list ← G_i
11:             end if
12:         end if
13:         Generate and initialize SEMTs (pulse width, faulty logic state and occurrence time within clock period)
14:     end for
15: end function
```

More specifically, an ionizing particle strike on the circuit is considered to be random; hence, a random point on the circuit die and a random particle energy are selected to simulate this event (lines 3-5). Note that the particle energies and the corresponding affected areas are obtained from Table 4.2. Then, the radius of the oval-shaped area is calculated (line 6) and all the cells of the circuit, except for the FFs, are examined successively. Given the position of the nMOS and pMOS diffusions on the circuit die, obtained from the DEF and GDSII files, we check whether they are located within the range of the particle (line 8). If so, we check whether the affected nMOS or pMOS is sensitive, i.e., inactive, depending on the gate logic inputs (line 9). If both checks are true, the current gate is inserted in the list of the affected gates (line 10). This procedure is followed for each gate, yielding a SEMT list. Upon identifying the affected gates, each SEMT is initialized at the gate output with respect to its erroneous logic state, pulse width, and onset time of the event.
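The identification part of Algorithm 2 (lines 2-12) can be sketched in Python as shown below. The data layout (each gate mapped to a list of diffusion rectangles with a sensitivity flag), the point-to-rectangle distance, and all names and numbers are assumptions made for illustration; the sensitivity flags would in practice come from the logic simulation, and the initialization of line 13 is omitted.

```python
import math
import random

def distance_to_rect(px, py, rect):
    """Euclidean distance from a point to an axis-aligned rectangle
    (x0, y0, x1, y1); zero when the point lies inside it."""
    x0, y0, x1, y1 = rect
    dx = max(x0 - px, 0.0, px - x1)
    dy = max(y0 - py, 0.0, py - y1)
    return math.hypot(dx, dy)

def affected_gates(gates, x_hit, y_hit, r):
    """Lines 7-12: gates with a sensitive diffusion closer than r to the hit."""
    semt_list = []
    for name, diffusions in gates.items():
        for rect, is_sensitive in diffusions:
            if distance_to_rect(x_hit, y_hit, rect) < r and is_sensitive:
                semt_list.append(name)
                break                     # one affected diffusion is enough
    return semt_list

def error_generation(gates, die_w, die_h, radius_by_energy, rng):
    """Lines 2-6: draw a random strike point and energy, then identify gates."""
    x_hit = rng.uniform(0.0, die_w)
    y_hit = rng.uniform(0.0, die_h)
    energy = rng.choice(sorted(radius_by_energy))
    return affected_gates(gates, x_hit, y_hit, radius_by_energy[energy])

# Hypothetical layout (um): gate -> [(diffusion rectangle, sensitive?)]
GATES = {
    "G1": [((0.0, 0.0, 1.0, 1.0), True)],
    "G2": [((1.5, 0.0, 2.5, 1.0), True)],
    "G3": [((1.2, 0.0, 1.4, 1.0), False)],   # in range but not sensitive
    "G4": [((5.0, 5.0, 6.0, 6.0), True)],    # sensitive but out of range
}
semt = affected_gates(GATES, x_hit=1.1, y_hit=0.5, r=0.6)
print(semt)  # ['G1', 'G2']
```

A linear scan over all gates mirrors the pseudocode; for large designs a spatial index over the diffusion rectangles would avoid examining every gate for each strike.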

From an implementation point of view, and as regards the modeling of the masking mechanisms, the simultaneous fault propagation requires three two-dimensional (2D) arrays, one for each masking mechanism, so that every fault propagates separately from the others along with its own masking information. Note that their size changes dynamically and depends on the number of transient faults generated by the particle strike. At the end of each simulation, multiple checks are conducted at the FF inputs to determine the faults that will be captured by the memory elements. In this case, however, a single latched fault among the multiple faults is sufficient for a soft error to emerge. In Algorithm 3, we present all the steps of the SER estimation framework considering that SEMTs may emerge from a single ionizing particle.

The SER is defined as the rate at which soft errors emerge, or are predicted to emerge, in a device or system and is typically measured either in FIT (Failures In Time), which is equivalent to the number of failures per one billion hours of operation, or in MTTF (Mean Time To Failure), which denotes the expected time of operation until a failure occurs. However, the expression of the SER in terms of FIT prevails in the literature and is widely used in the semiconductor industry due to its efficacy in evaluating the susceptibility of ICs.

As soon as the overall SER probability of the circuit is obtained, we are able to calculate the SER in terms of FIT, given the environment of the estimation as well as the circuit die area, as Equation 4.8 shows:

$$SER_{FIT} = F \cdot A \cdot SER_{prob} \tag{4.8}$$

where F is the neutron flux, A is the area of the circuit under test, which is exposed to the flux, and  $SER_{prob}$  is the SER probability as obtained previously [85].

**Algorithm 3** SER calculation

Input: Circuit DEF file, Logic cells' GDSII files, NLDM liberty file (optional)

Output: Circuit failure probability and SER in terms of FIT & various experimental results

```
 1: function SER ESTIMATION
 2:     for injected fault i do
 3:         SEMT_list ← ERROR GENERATION()
 4:         for each input vector do
 5:             function CIRCUIT SIMULATION
 6:                 w[SEMT_list] ← propagated SET width (electrical masking)
 7:                 d_prop[SEMT_list] ← propagation delay of SET (timing masking)
 8:                 for each flip-flop F do
 9:                     if error not logically or electrically masked then
10:                         if latching condition of all errors of SEMT_list is true then
11:                             latched ← 1
12:                             Break and examine another input vector
13:                         else
14:                             latched ← 0
15:                         end if
16:                     end if
17:                 end for
18:                 overall_latched ← overall_latched + latched
19:             end function
20:         end for
21:         P_fail^i ← overall_latched / (# of input vectors)
22:     end for
23:     SER_prob ← (Σ_i P_fail^i) / (# of injected faults)
24:     SER_FIT ← F · A · SER_prob
25: end function
26: Export various experimental results regarding circuit reliability
```
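Equation 4.8 can be evaluated with a few lines of Python, as sketched below. The unit handling is an assumption of this sketch: the flux is taken in particles per cm² per hour and the area in cm², so the product is a failure rate per hour that is then scaled to failures per $10^9$ hours; if F is already expressed per $10^9$ hours, the scaling factor should be dropped. The function names and the numeric inputs are hypothetical and are not measured values.

```python
def ser_fit(flux_per_cm2_h, area_cm2, ser_prob):
    """Equation (4.8), with an explicit conversion to FIT.

    Assumption: flux in particles/(cm^2 * h) and area in cm^2.
    """
    failures_per_hour = flux_per_cm2_h * area_cm2 * ser_prob
    return failures_per_hour * 1e9      # failures per 10^9 hours

def mttf_hours(fit):
    """MTTF in hours that corresponds to a constant failure rate in FIT."""
    return 1e9 / fit

fit = ser_fit(flux_per_cm2_h=10.0, area_cm2=1e-4, ser_prob=0.1)  # made-up inputs
print(f"SER = {fit:.1f} FIT, MTTF = {mttf_hours(fit):.1f} h")
```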

To provide a comprehensive view of the implementation part of this thesis, we present the overall flow in Figure 4-9. In particular, the flow is divided into three different parts that are interdependent. First of all, the design flow describes a basic methodology to obtain the files that will be the inputs to the other flows, given an RTL design for a specific technology library. This flow includes the synthesis of the initial design as well as its placement and routing with the appropriate EDA tools. The SER estimation flow then starts by parsing the generated DEF file and creates the topology of the design that is going to be utilized throughout the process. Before proceeding to the SER estimation process, it is necessary to perform a timing analysis of the circuit. Generally, there are two approaches. The first is to incorporate the results of an industrial timing analysis tool (e.g., Synopsys<sup>®</sup> PrimeTime<sup>™</sup>), which is an accurate but costly solution. The second, which was ultimately adopted, is to implement a basic in-house timing analysis that serves the purpose of this dissertation. Moreover, it can be enhanced with state-of-the-art work from the literature, such as [86], which adopts Current Source Models (CSMs) to accurately estimate gate delay. Then the SER estimation flow resumes by injecting random ionizing particles on the circuit layout and identifying the SEMTs that are potentially generated. The initialization of the fault (i.e., faulty logic value, SET width, and time of the incidence within the period) and of the logic input vector follows for a single simulation. Each simulation cycle includes the masking mechanisms, which are essentially the determinant factors for a soft error to emerge. Finally, an estimation of the design's SER is provided along with various results regarding its susceptibility to ionizing radiation.

Figure 4-9: Overall flow of the SER estimation framework.

# Chapter 5

# Experimental Results

This chapter presents the procedure for the verification of the SER estimation tool in combinational logic utilizing a subset of ISCAS '89 benchmark circuits. Various experimental results are also demonstrated on the whole benchmark suite, evaluating the susceptibility of the circuits to radiation-induced soft errors. A variation of the basic SER estimation process is presented also to deal with the highly time-consuming Monte Carlo simulations.

## 5.1 Soft Error Rate Verification

The validation of a SER estimation tool, like the one described in this dissertation, is indispensable in order to quantify its accuracy and, subsequently, to improve that accuracy through calibration and optimization. Traditionally, the verification is conducted through SPICE-like software or other electronic circuit simulators. However, there are some difficulties in this process. First of all, discrepancies between the results of the electrical-level simulation (SPICE) and the logic-level simulation (SER estimation tool) are inevitable, since the former takes into account several electrical parameters to perform the simulations, whereas the latter constitutes a simplified high-level approach, which nevertheless has its own advantages. Besides, SPICE simulation is extremely time-consuming when it comes to large-scale circuits, rendering it disadvantageous in terms of execution time. In fact, it could take many hours, even days, for the simulation of a design consisting of thousands or millions of nodes to complete. Therefore, the development of logic-level simulators that are orders of magnitude faster and, at the same time, approximate the accuracy of SPICE-like simulators is imperative.

A critical factor that worsens the problem of high execution times is the number of fault injections. A variety of faults should be injected and applied on every node to excite the entire circuit and achieve a reliable outcome. However, this would incur a high cost in terms of time even for small circuits; hence, there should be a tradeoff between execution time and accuracy, obtained by adjusting the total number of faults injected. In the context of this dissertation, the verification was performed with Synopsys<sup>®</sup> HSPICE<sup>™</sup>, and only the small-scale ISCAS '89 benchmark circuits were utilized (i.e., s27, s298, s344, s349), since otherwise the simulations would have needed a vast amount of time to complete. Table 5.1 presents the benchmarks that are used and the number of faults injected for the SER verification.

Table 5.1: Number of faults injected for each benchmark

| Benchmark | Number of Nodes | Number of Faults |
|-----------|-----------------|------------------|
| s27       | 17              | 251              |
| s298      | 169             | 158              |
| s344      | 240             | 94               |
| s349      | 224             | 65               |

The simulation setup for the verification is as follows. First, the netlist is created with the logic cells and the transistor models obtained from the 45nm Open Cell Library [87]. A voltage source pulse is applied to each primary input such that the 2nd input has double the pulse width and period of the 1st input, the 3rd input has double the pulse width and period of the 2nd, and so on. Thus, all different input combinations of logic 0 and logic 1 are covered. Also, the period of the clock pulse that is connected to the FFs is set to the maximum circuit propagation delay, as obtained from the critical path extraction. A single transient analysis follows to record the FF outputs at every time moment. Subsequently, a specific number of faults are injected on the outputs of all the logic cells at random moments within the clock period, performing a simulation after each injection. Finally, a simple script is utilized to compare the results of the faulty simulations with the initial simulation and calculate the overall circuit failure probability. The flow of the HSPICE simulation methodology is presented in Figure 5-1.



Figure 5-1: HSPICE simulation flow for verification.
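The doubling scheme for the primary input pulses behaves like a binary counter, which is why it covers every input combination. The short Python sketch below checks this for a hypothetical circuit with four primary inputs; it assumes that every pulse starts at logic 0, and the names `input_value` and `covered_vectors` as well as the 100 ps base width are illustrative only.

```python
from itertools import product

def input_value(k, t, base_width):
    """Logic value at time t of primary input k (0-based), whose pulse width
    is base_width * 2**k and whose period is twice that width.
    Assumption: every pulse starts at logic 0."""
    return int(t // (base_width * 2 ** k)) % 2

def covered_vectors(num_inputs, base_width):
    """Input vectors seen when sampling the middle of each base-width slot
    over one period of the slowest input."""
    samples = [base_width * (i + 0.5) for i in range(2 ** num_inputs)]
    return {tuple(input_value(k, t, base_width) for k in range(num_inputs))
            for t in samples}

seen = covered_vectors(num_inputs=4, base_width=100.0)   # widths in ps
print(len(seen))  # 16
```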

Table 5.2 summarizes the verification results on the examined benchmarks, along with the number of nodes, indicating a high accuracy.

Table 5.2: Verification results of failure probability

| Benchmark | Number of Nodes | Our tool: Time | Our tool: SER | HSPICE: Time | HSPICE: SER | Accuracy |
|-----------|-----------------|----------------|---------------|--------------|-------------|----------|
| s27       | 17              | 1s             | 0.152721      | 4m           | 0.159735    | 95.6%    |
| s298      | 169             | 1s             | 0.097645      | 17m          | 0.104773    | 93.2%    |
| s344      | 240             | 1s             | 0.131412      | 35m          | 0.141922    | 92.6%    |
| s349      | 224             | 1s             | 0.190249      | 42m          | 0.170844    | 89.8%    |

To be able to obtain an accurate SER estimation, the modeling of the basic masking mechanisms should be accurate as well. Therefore, to verify both the electrical and timing masking models that are implemented in the SER estimation tool, we extract different logic paths, varying in the number and type of logic gates, from various benchmark designs. Subsequently, each path is imported to HSPICE, a SET pulse is applied on the input of the first gate, and a simulation is performed to obtain the pulse width and overall path delay at the output of the last gate. Note that the other gate inputs are held at non-controlling values to prevent logical masking. Similarly, the selected paths are simulated with the SER estimation tool, applying a SET of the same width as in HSPICE and observing the shape of the pulse at the end of the path. Table 5.3 lists the paths that are checked, the length of each one, and the output SET pulse width and propagation delay in picoseconds for both SPICE and our tool's simulations. The average accuracy of the proposed approach hovers at about 91%, which is acceptable considering the difference in execution time, since HSPICE simulation is much more time-consuming than the simulation with our tool.

Table 5.3: Comparison of the proposed electrical and timing masking models with HSPICE on various SET pulse propagation paths

| Path   | Gate Stages | Electrical: SPICE (ps) | Electrical: Tool (ps) | Electrical: Acc. | Timing: SPICE (ps) | Timing: Tool (ps) | Timing: Acc. |
|--------|-------------|------------------------|-----------------------|------------------|--------------------|-------------------|--------------|
| Path 1 | 5           | 201                    | 212                   | 94.5%            | 171                | 183               | 93%          |
| Path 2 | 8           | 198                    | 189                   | 95.5%            | 238                | 258               | 91.6%        |
| Path 3 | 13          | 204                    | 215                   | 94.6%            | 394                | 425               | 92.2%        |
| Path 4 | 20          | 195                    | 187                   | 95.9%            | 685                | 749               | 90.7%        |
| Path 5 | 25          | 197                    | 185                   | 93.9%            | 849                | 946               | 89.9%        |
| Path 6 | 36          | 208                    | 221                   | 93.8%            | 1282               | 1379              | 92.4%        |
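The text does not state how the accuracy columns are computed. Most entries of Table 5.3 appear consistent with one minus the relative deviation of the tool's value from the SPICE reference, so the following minimal sketch uses that definition as an assumption. The function name `relative_accuracy` is hypothetical.

```c
#include <assert.h>

/* Relative accuracy of a tool estimate against a reference value.
 * Assumed definition (inferred from the tabulated numbers, not stated
 * in the text): 1 - |tool - ref| / ref. */
double relative_accuracy(double tool, double ref)
{
    assert(ref > 0.0);                                /* reference must be positive */
    double diff = tool > ref ? tool - ref : ref - tool;
    return 1.0 - diff / ref;
}
```

For Path 1, `relative_accuracy(212.0, 201.0)` evaluates to roughly 0.945, in line with the tabulated 94.5%.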

## 5.2 Experimental Results

### 5.2.1 Experimental setup

The proposed SER estimation framework was developed in the C programming language, and all the experiments were performed on the same Linux-based workstation with an Intel<sup>®</sup> Core<sup>™</sup> i7-3770 Processor @3.4GHz and 8GB RAM. To demonstrate the circuit vulnerability evaluation, we considered the whole ISCAS '89 benchmark suite, whose designs consist of combinational logic and storage elements (i.e., D-type FFs). All the designs were synthesized with Synopsys<sup>®</sup> Design Compiler<sup>™</sup> and placed and routed with Cadence<sup>®</sup> Innovus<sup>™</sup>, both with respect to the widely-utilized 45nm and the more recent 15nm Open Cell Libraries [87].

### 5.2.2 Impact of the electrical and timing masking modeling on the SER

A determining factor in the evaluation of the SER is the modeling of the masking mechanisms. Since logical masking is quite trivial to model, the approach taken to model the other mechanisms may significantly affect the SER evaluation. We compared the failure probabilities of some benchmarks, for both 45nm and 15nm technologies, under the two electrical masking approaches. The corresponding results are shown in Table 5.4. The first approach is based on the closed-form expression that takes into account the inertial gate delay, whereas the other utilizes the actual delays, obtained from the NLDM-based STA, and considers the actual transitions of the SET pulses as they propagate through the gates. The results show that, generally, the failure probabilities are greater when the NLDM-based approach is used. This is explained by the fact that the SET pulse width may broaden, depending on the rising and falling delays of each gate, as it propagates. The analytical expression of the first approach, in contrast, assumes that the SET pulse width may only be reduced or remain unchanged. Also, comparing the results for the different technologies, the failure rate is higher

Table 5.4: Failure probabilities considering the closed-form approach and the NLDM-based approach for electrical masking and for both 45nm and 15nm technologies on a subset of ISCAS '89 benchmarks

| Benchmark | 45nm: Closed-form | 45nm: NLDM-based | 15nm: Closed-form | 15nm: NLDM-based |
|-----------|-------------------|------------------|-------------------|------------------|
| s27       | 0.1293            | 0.1527           | 0.2134            | 0.2917           |
| s344      | 0.0943            | 0.1314           | 0.1345            | 0.1842           |
| s641      | 0.0215            | 0.0528           | 0.0541            | 0.0971           |
| s9234     | 0.0302            | 0.0718           | 0.0803            | 0.1103           |
| s13207    | 0.0189            | 0.0423           | 0.0442            | 0.0747           |
| s15850    | 0.0103            | 0.0413           | 0.0329            | 0.0781           |
| s35932    | 0.0015            | 0.0069           | 0.0047            | 0.0098           |
| s38584    | 0.0039            | 0.0084           | 0.0089            | 0.1082           |

at 15nm for both approaches. This result is mainly due to the reduced device feature size, which induces more SEMTs compared to the 45nm technology. Although the SET pulse width is smaller when an ionizing particle strikes a FinFET device, which is the case for the 15nm technology, the decrease in the delay of the gates, and thus the increased operating frequencies, along with the increased number of SEMTs, result in higher failure probabilities.
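The difference between the two electrical masking approaches can be illustrated with a minimal sketch. The exact closed-form expression used by the tool is not reproduced in this section, so the piecewise form below, which is common in the literature, is an assumption, as are the function names. The second function only encodes the observation that the leading and trailing edges of a pulse may see different gate delays.

```c
#include <assert.h>

/* Closed-form style model (assumed piecewise form): a pulse narrower than
 * the gate delay d is filtered out, a pulse wider than 2d passes unchanged,
 * and widths in between are attenuated to 2(w - d).
 * The output is never wider than the input. */
double set_width_closed_form(double w_in, double d)
{
    assert(w_in >= 0.0 && d >= 0.0);
    if (w_in < d)       return 0.0;
    if (w_in > 2.0 * d) return w_in;
    return 2.0 * (w_in - d);
}

/* Edge-delay model: the width changes by the difference between the delay
 * of the trailing edge and the delay of the leading edge, so the pulse can
 * broaden as well as shrink. */
double set_width_edge_delays(double w_in, double d_leading, double d_trailing)
{
    assert(w_in >= 0.0);
    double w_out = w_in + d_trailing - d_leading;
    return w_out > 0.0 ? w_out : 0.0;   /* a vanished pulse has zero width */
}
```

With a 100 ps input pulse, a 20 ps leading-edge delay and a 30 ps trailing-edge delay, the second model yields 110 ps, whereas the first model can never return more than the 100 ps it was given.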

Table 5.5 reports the failure probabilities for three different cases utilizing the 45nm and 15nm technologies. In the first, the Logical Effort (LE) technique is used to estimate gate delays, whereas in the second a STA based on an NLDM is implemented and used. In the last case, an enhanced timing analysis is implemented, which incorporates an RC interconnection model to account for the parasitic delay. According to the experimental results, in most of the designs, the SER decreases when the NLDM and RC I/C approaches are considered. This is explained by the fact that LE is an approximation method to estimate gate delay, taking into account transistor widths and lengths as well as the number of fan-outs and inputs, albeit neglecting the input transition times and the actual total output load capacitance. Therefore, the gate delay is overestimated, compared to the other

Table 5.5: Failure probabilities considering LE, NLDM and RC Interconnection approaches for both 45nm and 15nm technologies on a subset of ISCAS '89 benchmarks

| Benchmark | 45nm: LE | 45nm: NLDM | 45nm: RC I/C | 15nm: LE | 15nm: NLDM | 15nm: RC I/C |
|-----------|----------|------------|--------------|----------|------------|--------------|
| s27       | 0.3230   | 0.1846     | 0.1527       | 0.2808   | 0.4216     | 0.2917       |
| s344      | 0.2086   | 0.1602     | 0.1314       | 0.0901   | 0.2556     | 0.1842       |
| s641      | 0.0499   | 0.0974     | 0.0528       | 0.0384   | 0.1826     | 0.0971       |
| s9234     | 0.0632   | 0.0956     | 0.0718       | 0.0547   | 0.1641     | 0.1103       |
| s13207    | 0.0445   | 0.0515     | 0.0423       | 0.0526   | 0.1107     | 0.0747       |
| s15850    | 0.0415   | 0.0659     | 0.0413       | 0.0345   | 0.1015     | 0.0781       |
| s35932    | 0.0044   | 0.0121     | 0.0069       | 0.0104   | 0.0227     | 0.0098       |
| s38584    | 0.0113   | 0.0137     | 0.0084       | 0.0094   | 0.0379     | 0.0182       |

models, resulting in a smaller period, which eventually increases the SER, as it is more probable for a glitch to be latched during the latching window. As expected, the SER is further reduced when interconnection delay is considered, since the critical path delay, and thus the circuit period, increases, and the SET propagation delay increases as well. Comparing the results across the technologies, as expected and explained previously, the failure probabilities are higher for the 15nm technology with the NLDM and RC I/C approaches. As regards the LE approach, the difference in the failure probabilities can be explained by the different designs that result from logic synthesis. The SER results for the whole ISCAS '89 suite obtained from our tool, incorporating the NLDM-based and the RC I/C models for the electrical and timing masking, respectively, are reported in terms of FIT for both technologies in Appendix A.
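The link between the clock period and the latching probability can be sketched with a first-order model. It assumes, purely for illustration, that the SET arrives at a uniformly random instant within the clock period and that it must cover the whole latching window (setup plus hold time) to be captured. The function name and this model are assumptions and do not reproduce the tool's timing masking implementation.

```c
#include <assert.h>

/* Probability that a SET of width w arriving at a flip-flop input is
 * latched. Assumptions (illustration only): uniform arrival time over the
 * clock period t_clk, and the pulse must span the full setup + hold
 * window. All times share the same unit. */
double latch_probability(double w, double t_setup, double t_hold, double t_clk)
{
    assert(t_clk > 0.0);
    double window = t_setup + t_hold;
    if (w <= window)
        return 0.0;                      /* too narrow to be captured */
    double p = (w - window) / t_clk;
    return p < 1.0 ? p : 1.0;
}
```

Under these assumptions, halving the clock period doubles the latching probability of a given pulse, which matches the qualitative argument above that a smaller period raises the SER.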

### 5.2.3 Impact of the masking mechanisms on SET propagation

The masking mechanisms are a determining factor in the SER estimation of a circuit. Their impact on the vulnerability of various designs is presented through the following experimental results. Since the masking mechanisms are contingent on the connectivity and the design properties of the individual circuits, their impact varies from one circuit to another. Figure 5-2 presents the impact of each masking mechanism on the reliability estimation of some ISCAS '89 benchmarks and, particularly, the percentage of the SETs that were masked (logically, electrically, or by timing) and not masked when the same number of faults was injected for both 45nm and 15nm technologies. Moreover, the respective failure probabilities are presented to elaborate on the results of the masking mechanisms. It is evident that for both technologies logical masking is the predominant factor that contributes to SER mitigation, and it hovers around the same percentages, which is reasonable given the similar logic-level circuit implementation during synthesis. However, timing masking in the 15nm technology is less effective than in the 45nm technology, which is explained by the higher operating frequencies that increase the probability of a SET being latched by a FF. Also, note that in the 15nm technology the overall masking ability of the benchmarks is reduced, mainly due to the existence of more SEMTs, which justifies the higher failure probabilities. As regards the percentages across the benchmarks, the smaller benchmarks are less likely to mask the generated SETs through any mechanism, which is reflected negatively in the failure probabilities, as shown in Figure 5-2(c).

### 5.2.4 Evaluation of the effect of SET and SEMT consideration on the SER

To corroborate the significance of SEMT consideration for a SER estimation analysis, we performed two types of simulations evaluating and comparing the failure probabilities of some benchmarks. For the first simulation, we considered that a particle strike may affect no more than one gate producing one SET at most, whereas for the second, a particle strike may induce multiple SETs on neighboring gates. The same simulations were performed for both 45nm and 15nm technologies to assess the failure probability trend as technology downscales. Figure 5-3 illustrates the experimental results of the failure probabilities for each case. As expected, the failure probabilities



Figure 5-2: Impact of the masking effects on the propagation of radiation-induced SETs through some benchmark circuits for (a) 45nm and (b) 15nm technologies, and (c) the corresponding failure probabilities.

are higher for both technologies when SEMTs are taken into account, which indicates that neglecting them results in underestimation of the SER. When only SETs are considered, the failure probabilities at 15nm are slightly lower due to the smaller pulse width that an ionizing particle may induce, given that this is a FinFET-based technology. Conversely, the failure probabilities are substantially higher when SEMTs occur, due to the higher integration density, which induces a greater number of SEMTs. Note that most of the benchmarks are in accordance with this outcome, except for the s35932 benchmark, where the failure probability is so small that there is no clear differentiation among the four cases.
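The distinction between the two simulation types can be sketched as a layout query: count how many cells fall inside the region affected by a strike. The circular affected region, the `cell_pos` type and the function name are assumptions made for illustration; the tool's actual layout-aware model is not shown in this section.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y; } cell_pos;   /* cell centre, layout units */

/* Count the cells whose centre lies within `radius` of the strike point
 * (sx, sy). A count of 1 would correspond to a SET, a count above 1 to
 * a SEMT. The circular region is an illustrative assumption. */
size_t cells_hit(const cell_pos *cells, size_t n,
                 double sx, double sy, double radius)
{
    assert(radius >= 0.0);
    size_t hit = 0;
    for (size_t i = 0; i < n; i++) {
        double dx = cells[i].x - sx;
        double dy = cells[i].y - sy;
        if (dx * dx + dy * dy <= radius * radius)
            hit++;
    }
    return hit;
}
```

With the same strike radius, a denser placement puts more cells inside the region, which is the intuition behind the greater number of SEMTs at 15nm.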



Figure 5-3: Failure probabilities of the benchmarks considering SETs and SEMTs for both 45nm and 15nm technologies.

The number of SEMTs induced by an ionizing particle strike is decisive for the circuit reliability evaluation, and thus for the development of SER mitigation techniques. Figure 5-4 provides an overview of the distribution of a certain number of faults injected during a simulation for some benchmarks for both 45nm and 15nm technologies. In particular, the percentages of the injected faults that induced SETs and SEMTs are reported, along with the percentage of the injected faults that were latched by a memory element (eventually producing a soft error) and the obtained failure probability for each benchmark. Regarding the 45nm technology, the results show that the percentage of SEMTs is only slightly greater than that of SETs for the majority of the benchmarks. On the contrary, in the 15nm technology, SEMTs are far more numerous than SETs, which is explained by the decreased transistor size. Besides, a substantial increase can be observed in the percentage of latched faults, which significantly degrades the tolerance of the circuits to radiation hazards, according to the reported failure probabilities.



Figure 5-4: Percentage of injected faults inducing SETs or SEMTs and being latched by a FF, and the corresponding failure probability of some benchmarks for (a) 45nm and (b) 15nm technology.

### 5.2.5 Circuit site and gate vulnerability evaluation

To identify the sites of a circuit that are most vulnerable to radiation-induced transient faults, we performed a topological analysis, which is a variation of the basic SER estimation framework. In particular, given a physical circuit layout, as defined by the DEF file, we divided it into several smaller equal parts, hereafter called grids, which are regarded as subcircuits. Note that the number of grids may differ depending on the intended level of granularity, although there is an upper bound on this number, which is determined by the circuit size, since for very small grids the extracted data may be misleading, resulting in an inaccurate SER evaluation. Afterwards, a certain number of particles of various random energies were injected on each grid, inducing a different number of errors, since each subcircuit contains its own set of logic cells placed in this area. Then we applied the main SER estimation process to each grid, obtaining the individual SER, which indicates the susceptibility of this circuit site to radiation hazards.
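The partitioning step described above can be sketched as a mapping from a cell position to the index of the grid that contains it. The function name, the row-by-row numbering and the use of a rectangular die with its origin at the lower-left corner are assumptions for illustration.

```c
#include <assert.h>

/* Map a cell position (x, y) to the index of the grid that contains it,
 * for a die of size die_w x die_h split into nx x ny equal grids.
 * Grids are numbered row by row starting from the lower-left corner
 * (an assumed convention). */
int grid_index(double x, double y, double die_w, double die_h, int nx, int ny)
{
    assert(nx > 0 && ny > 0 && die_w > 0.0 && die_h > 0.0);
    assert(x >= 0.0 && x <= die_w && y >= 0.0 && y <= die_h);
    int col = (int)(x / die_w * nx);
    int row = (int)(y / die_h * ny);
    if (col == nx) col = nx - 1;   /* a point on the right or top edge */
    if (row == ny) row = ny - 1;   /* belongs to the last grid         */
    return row * nx + col;
}
```

Grouping the placed cells by this index gives the set of logic cells of each subcircuit, on which the per-grid SER estimation can then be run.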

Figure 5-5 shows the SER hotspot analysis of the s15850 and s35932 benchmark circuits for both 45nm and 15nm technologies. The die area of the s35932 is over three times as large as the die area of the s15850 for both technologies. Therefore, we divided the former into 100 grids, whereas the latter into approximately 1000 grids to provide an accurate overview of the most vulnerable areas. We notice that some grids (i.e., those with the deep red and orange colors) are considerably more vulnerable than others, which allows designers to intervene and reconsider the design in these regions by applying minor modifications and selective radiation-hardening techniques to deal with the large SER values that increase the overall circuit SER. Evidently, when intervening in a particular grid, one or more grids in its vicinity may deteriorate in terms of SER. What is more, these adjustments may affect the Power, Performance and Area (PPA) variables of the design. Thus, a compromise should be reached between SER reduction and PPA worsening. Comparing the results across the technologies, we observe that in 15nm, for both circuits, there are more hotspots, which basically results from the increased



Figure 5-5: Illustration of the SER hotspot circuit regions of s15850 and s35932 benchmarks for both 45nm and 15nm technologies.

number of SEMTs. Besides, the maximum reported individual SER value is greater in 15nm technology justifying the increased overall SER values as presented previously.

Besides the identification of vulnerable circuit sites, our methodology can be applied to each gate individually, thus obtaining the gate vulnerability. Figure 5-6 demonstrates the gate sensitivity of some benchmarks with respect to the GLP values of the gates. In particular, two GLP thresholds were selected to distribute the gates in three sensitivity levels. For the small benchmarks, more than half of the total gates exceed the threshold of 0.2, i.e., GLP > 0.2, which means that a particle striking any of these gates is more likely to result in a soft error. On the other hand, for the large-scale benchmarks the percentage of the gates having GLP values less than 0.2 increases, which is reasonable taking into account that SETs are more likely to be masked along the potentially long propagation paths of such benchmarks, resulting in decreased GLP values. The advantage of this method is that the gate sensitivity values may be exploited to harden the most vulnerable gates in order to achieve SER reduction. Note that the thresholds may vary, depending on the circuit complexity, in order to identify the most vulnerable gates accurately.
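The three-level distribution can be sketched as a simple binning of the GLP values against two thresholds. Only the 0.2 threshold is given in the text; the second threshold, the function name and the treatment of values equal to a threshold are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count gates in three sensitivity levels using two GLP thresholds
 * (t_low < t_high): counts[0] holds GLP <= t_low, counts[1] holds
 * t_low < GLP <= t_high, counts[2] holds GLP > t_high. */
void classify_gates(const double *glp, size_t n,
                    double t_low, double t_high, size_t counts[3])
{
    assert(t_low < t_high);
    counts[0] = counts[1] = counts[2] = 0;
    for (size_t i = 0; i < n; i++) {
        if (glp[i] > t_high)
            counts[2]++;
        else if (glp[i] > t_low)
            counts[1]++;
        else
            counts[0]++;
    }
}
```

Calling it with `t_high = 0.2` and a hypothetical `t_low = 0.1` yields the number of gates per level; the gates counted in `counts[2]` would be the first candidates for hardening.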



Figure 5-6: Distribution of the gates depending on their sensitivity (GLP values) for some benchmarks.

### 5.2.6 Temperature dependence of the SER

It is quite common for modern ICs to operate in elevated temperature environments, which may affect the SER values. Actual measurements of radiation-induced SETs have revealed a strong dependency of their width on the operating temperature, attributed mainly to the enhancement of the parasitic bipolar effect, according to [88]. In particular, the experiments have reported SET pulse widths increased by up to 80% as the temperature increases from 25°C to 100°C. Also, the SET widths increase more with temperature when they originate from pMOS particle strikes than from nMOS particle strikes. As discussed, the SET pulse widths are a determining factor for the emergence of soft errors in combinational logic, and thus the consideration of the operating temperature contributes to the accurate evaluation of SER. We performed different simulations for each benchmark to identify the impact of elevated temperatures on the failure probabilities. Figure 5-7 demonstrates the obtained failure probabilities of the benchmarks for different operating temperatures, i.e., 25°C, 50°C, and 100°C, indicating the substantial increase in the failure rates of all the benchmarks and, consequently, the increased susceptibility to radiation-induced faults under such conditions. Note that the SET widths utilized for each experiment were obtained from several SPICE simulations by injecting particle strikes of various energies on every logic gate and for the respective temperatures.



Figure 5-7: Impact of operating temperature on the failure probabilities of some benchmarks.

### 5.2.7 Comparison among similar SER estimation approaches

Several studies exist in the literature that model the radiation-induced soft errors and provide various methodologies to estimate the SER of digital ICs. Summarizing the features of our SER estimation framework, in this section we present a qualitative comparison among some of the well-known approaches, which were cited previously in the Related Work chapter. Table 5.6 lists a number of characteristics that a SER estimator may incorporate and indicates which of them are implemented by each approach, juxtaposing them with the features of our SER estimation tool.

Table 5.6: Qualitative comparison of the proposed tool with other state-of-the-art SER estimation approaches

| Work                           | Combinational logic | Logical masking | Electrical masking | Timing masking | SEMTs | Layout-aware | Monte Carlo | Charge coll. | Fault injection | Spice sim. | Reconvergent pulses |
|--------------------------------|---------------------|-----------------|--------------------|----------------|-------|--------------|-------------|--------------|-----------------|------------|---------------------|
| Murley-Srinivasan [27]         | ✗                   | ✗               | ✗                  | ✗              | ✗     | ✓            | ✓           | ✓            | ✓               | ✗          | ✗                   |
| Cha et al. [30]                | ✓                   | ✓               | ✗                  | ✓              | ✗     | ✗            | ✗           | ✓            | ✓               | ✓          | ✗                   |
| Zhao et al. [31]               | ✓                   | ✓               | ✓                  | ✓              | ✗     | ✗            | ✗           | ✗            | ✗               | ✓          | ✗                   |
| Rajaraman et al. [32]          | ✓                   | ✓               | ✓                  | ✓              | ✗     | ✗            | ✗           | ✗            | ✗               | ✓          | ✓                   |
| Rao et al. [34]                | ✓                   | ✓               | ✓                  | ✓              | ✗     | ✗            | ✗           | ✓            | ✗               | ✓          | ✗                   |
| Dhillon et al. [35]            | ✓                   | ✓               | ✓                  | ✓              | ✗     | ✗            | ✗           | ✗            | ✓               | ✓          | ✗                   |
| Wang-Xie [41]                  | ✓                   | ✓               | ✓                  | ✓              | ✗     | ✗            | ✗           | ✗            | ✓               | ✓          | ✓                   |
| Zhang-Shanbhag [43]            | ✓                   | ✓               | ✓                  | ✓              | ✗     | ✗            | ✗           | ✓            | ✓               | ✓          | ✓                   |
| Miskov-Zivanov-Marculescu [52] | ✓                   | ✓               | ✓                  | ✓              | ✓     | ✗            | ✗           | ✓            | ✓               | ✓          | ✓                   |
| Fazeli et al. [53]             | ✓                   | ✓               | ✓                  | ✓              | ✓     | ✗            | ✗           | ✗            | ✓               | ✗          | ✓                   |
| Ebrahimi et al. [56]           | ✓                   | ✓               | ✓                  | ✓              | ✓     | ✓            | ✗           | ✗            | ✓               | ✗          | ✗                   |
| Huang-Wen [57]                 | ✓                   | ✓               | ✓                  | ✓              | ✓     | ✓            | ✗           | ✓            | ✓               | ✓          | ✓                   |
| Our work                       | ✓                   | ✓               | ✓                  | ✓              | ✓     | ✓            | ✓           | ✗            | ✓               | ✓          | ✓                   |

An accurate SER estimation tool needs primarily to implement the masking mechanisms that inherently mitigate the SER of a circuit. Most of the presented approaches take into account the masking effects, which by itself increases the accuracy of the tool. As mentioned, due to technology downscaling the radiation-induced SEMTs have become more prevalent than SETs, so accounting for such events follows the technology scaling and device trends and further enhances the accuracy. There are some recent works, including ours, that model and estimate the SER taking into account the emergence of SEMTs. However, considering the existence of SEMTs does not by itself enhance the overall SER estimation accuracy, since the actual circuit layout must also be taken into account. Including our work, very few approaches implement this feature.

## 5.3 Speed-up of SER Estimation

A drawback of a Monte Carlo-based SER estimation approach is its inherently long execution time. This problem worsens as the size of the circuits increases. Besides, the technology shrinking that follows Moore's Law has increased the number of transistors, and consequently the number of nodes in a dense IC, rendering this type of simulation time-consuming. Therefore, approaches that reduce the execution time to a reasonable level should be applied.

Apart from the scale of modern ICs, the main factor that aggravates this problem is the large number of simulations. There are several individual parameters that may inflate the total number of simulations so much that the whole process becomes impractical. These include the number of particle strikes, i.e., the injected faults, the number of primary input vectors applied for the logic-level simulation, the number of particles of different energy applied, and the number of different moments within the clock period at which a particular particle is injected. In view of the vast number of simulations for the different values of the aforementioned parameters, it is inevitable to resort to tradeoffs between accuracy and runtime to address this problem.
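As a rough illustration of how these parameters compound, and assuming (this is not stated in the text) that every parameter is swept independently of the others, the total number of simulations is the product

$$N_{sim} = N_{faults} \cdot N_{vectors} \cdot N_{energies} \cdot N_{instants}$$

where the four symbols are hypothetical names for the number of injected faults, primary input vectors, particle energies and injection instants, respectively. For instance, 1000 injected faults, 1000 input vectors, 10 particle energies and 10 injection instants would already amount to $10^{8}$ simulations.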

An intermediate solution that can be applied to limit the total number of simulations, and thus reduce the execution time while preserving the accuracy of the estimation, is to inject faults iteratively, computing the current failure probability after each iteration. As long as the relative difference in failure probabilities between two consecutive simulations exceeds a certain threshold, the process continues. The simulation ends when the failure probability saturates and remains practically stable, that is, when the relative difference no longer exceeds the threshold value, as Equation 5.1 shows:

$$\frac{\left|P_{fail}^{i} - P_{fail}^{i-1}\right|}{P_{fail}^{i-1}} < thres \tag{5.1}$$

where $P_{fail}^{i-1}$ and $P_{fail}^{i}$ are the failure probabilities after the (i-1)th and ith simulation, respectively, and thres is the selected threshold. In practice, the threshold value mainly determines the accuracy of the failure probability and secondarily how fast the failure probability converges to a steady value, and it may vary among circuits. A smaller threshold value means higher accuracy but longer execution time, whereas a higher value means shorter execution time but lower accuracy. This process is described in Algorithm 4.

```
Algorithm 4 Failure probability saturation
Input: Convergence thresholds thres1 and thres2
Output: Circuit failure probability

P_fail^0 ← 1
Inject a fault via ERROR GENERATION function
while |P_fail^i − P_fail^(i−1)| / P_fail^(i−1) > thres1 do
    while |P_fail^j − P_fail^(j−1)| / P_fail^(j−1) > thres2 do
        Apply new primary input vector
        P_fail^j ← Calculate new failure probability
    end while
    Inject a fault via ERROR GENERATION function
    P_fail^i ← Calculate new failure probability
end while
P_fail^final ← Saturated failure probability
```

First, the failure probability is initialized to 1 and then the simulation starts by injecting a fault and calculating the failure probability for a random PI vector. To model logical masking sufficiently, we need to apply different PI vectors. Thus, the failure probability is re-calculated taking into account the different PIs, until the relative difference between two consecutive calculations is less than the value of thres2. Subsequently, a new fault is injected and, after the failure probability calculation, we check whether the condition of Equation 5.1 holds for thres1. If it does, the failure probability has saturated and the simulation ends. It should be noted that thres1 and thres2 can be different, since the impact of different PI vectors on the failure probability is smaller than the impact of fault injection.
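The saturation procedure can be sketched in C as two nested convergence loops. This is a minimal sketch under stated assumptions, not the tool's implementation: the failure probability is taken to be the running ratio of latched to total simulations, the callback types and the `toy_*` stand-ins are hypothetical, and a `max_runs` cap is added so that the loop always terminates.

```c
#include <assert.h>

/* Relative change between two consecutive estimates (Equation 5.1).
 * A zero previous value makes the ratio undefined, so it is treated
 * as "not converged" (a choice made for this sketch). */
double rel_change(double cur, double prev)
{
    if (prev == 0.0)
        return 1.0;
    double d = cur > prev ? cur - prev : prev - cur;
    return d / prev;
}

/* Hypothetical callbacks: inject() selects a new fault; simulate()
 * applies a new primary input vector and returns 1 if the fault is
 * latched, 0 otherwise. */
typedef void (*inject_fn)(void *ctx);
typedef int  (*simulate_fn)(void *ctx);

double saturated_failure_probability(inject_fn inject, simulate_fn simulate,
                                     void *ctx, double thres1, double thres2,
                                     long max_runs)
{
    assert(thres1 > 0.0 && thres2 > 0.0 && max_runs > 0);
    long runs = 0, latched = 0;
    double p = 1.0;                       /* failure probability starts at 1 */
    double p_fault_prev;
    do {
        p_fault_prev = p;
        inject(ctx);                      /* new fault */
        double p_vec_prev;
        do {                              /* new PI vectors for this fault */
            p_vec_prev = p;
            latched += simulate(ctx);
            runs++;
            p = (double)latched / (double)runs;   /* assumed estimator */
        } while (rel_change(p, p_vec_prev) > thres2 && runs < max_runs);
    } while (rel_change(p, p_fault_prev) > thres1 && runs < max_runs);
    return p;
}

/* Toy stand-ins so the sketch can be exercised: every 4th run is latched. */
typedef struct { long calls; long faults; } toy_ctx;

void toy_inject(void *ctx)
{
    ((toy_ctx *)ctx)->faults++;
}

int toy_simulate(void *ctx)
{
    toy_ctx *t = (toy_ctx *)ctx;
    t->calls++;
    return (t->calls % 4) == 0;
}
```

With the toy callbacks and both thresholds set to 0.01, the estimate settles close to 0.25 after roughly a hundred simulations. In a real flow the callbacks would wrap the fault generation and the masking evaluation, and the two thresholds would be tuned per circuit.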

Figure 5-8 shows the convergence of the failure probabilities and the number of simulations needed to obtain the result for some benchmarks. We note that the small-scale designs (i.e., s27 and s526) converge faster than the others, which is reasonable considering the complexity of the large benchmarks.



Figure 5-8: Failure probability convergence.

Such an approach significantly reduces the execution time without sacrificing accuracy. In fact, the simulations provide satisfactory results in terms of speed-up, as Table 5.7 shows.

Table 5.7: Runtimes and speed-up of SER estimation execution on some ISCAS '89 benchmarks

| Benchmark | Old approach | New approach | Speed-up |
|-----------|--------------|--------------|----------|
| s820      | <1s          | <1s          | 1×       |
| s1423     | <1s          | <1s          | 1×       |
| s5378     | 6s           | 1.4s         | 4.3×     |
| s15850    | 135s         | 12s          | 11.3×    |
| s35932    | 630s         | 128s         | 4.9×     |
| s38417    | 19080s       | 130s         | 146.8×   |

To sum up, this process practically reduces the number of iterations for the different PIs and fault injections. Another way to cope with the large execution times, by reducing the number of iterations for the different PIs, is to use an appropriate data type for the modeling of logical masking. The data types that fit in this context and are utilized to store the logic states of each node are the unsigned integer and the long (long) unsigned integer. The advantage of these data types is that they utilize at least 16 and 32 (64) bits, respectively, to represent each number. Thus, we are able to utilize such types to excite up to 64 PIs in order to cover all logic input combinations, and thus model logical masking sufficiently. At the same time, the execution time is confined, which is important for a Monte Carlo-based simulation. However, for large-scale circuits this method is not sufficient, since these data types cannot cover more than 64 inputs; covering only a subset of the different PI vectors could underestimate the SER. This technique, though, requires modifications in the way the SER is calculated, specifically in the modeling and merging of the three masking effects at the FF inputs.
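One common way to exploit word-wide integer types in logic simulation is to let every bit position of a word carry the value of a node under a different input pattern, so that a single bitwise operation evaluates a gate for all patterns at once. The text does not spell out the encoding used by the tool, so the following minimal sketch is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Each bit of a 64-bit word holds the logic value of one node under a
 * different input pattern, so one bitwise operation evaluates a gate
 * for 64 patterns at once (assumed encoding). */
uint64_t nand2(uint64_t a, uint64_t b) { return ~(a & b); }
uint64_t xor2(uint64_t a, uint64_t b)  { return a ^ b; }

/* Number of patterns (out of 64) for which a faulty value differs from
 * the fault-free value at an output, i.e. is not logically masked. */
int visible_count(uint64_t fault_free_out, uint64_t faulty_out)
{
    uint64_t diff = fault_free_out ^ faulty_out;
    int n = 0;
    while (diff) {
        diff &= diff - 1;   /* clear the lowest set bit */
        n++;
    }
    assert(n <= 64);
    return n;
}
```

For a two-input NAND whose first input is flipped by a SET, the flip is visible exactly in those patterns where the second input is 1; the bit count of the difference word gives that number directly, without looping over the patterns one by one.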

# Chapter 6

# Soft Errors in FDSOI Technology

## 6.1 About Silicon On Insulator Technology

The MOSFET is the main building component of microelectronic ICs. Although the notion of the MOSFET as a device for controlling and amplifying electric current was first introduced in the early 1930s [89], its fabrication was delayed by a few decades due to the limited understanding of oxidation and semiconductor properties. Since then, the MOSFET has become the dominant semiconductor device utilized in ICs, and the MOSFET structure that prevailed until recently is the planar Bulk technology. Generally, Bulk CMOS refers to a chip built upon a standard silicon wafer, that is, a thin slice of monocrystalline silicon serving as a substrate. Moreover, a planar manufacturing process considers a 2-dimensional projection of the circuit, which facilitates the microfabrication processes, such as doping, thermal oxidation, etching, ion implantation, polysilicon and metal deposition, etc., unlike contemporary 3-dimensional structures, such as double gate MOSFETs and FinFETs.

A representation of a typical nMOSFET structure is shown in Figure 6-1. The nMOSFET device consists of a uniformly p-type doped silicon substrate called body and two highly doped individual n-type regions called source and drain, above which the metal contacts are attached. The "n+" notation indicates the high n-type concentration. The gate, which is located above the body and controls the charge flow in the n-channel between source and drain, is made from n-type polycrystalline



Figure 6-1: Structure of a typical bulk nMOSFET device.

silicon. Also, the gate is insulated from the rest of the device regions by a thin dielectric layer, usually silicon dioxide ($SiO_2$). Correspondingly, a pMOSFET consists of an n-type body, p+ source and drain regions, and a p-type polysilicon gate.

Over the past decades, the continuous decrease in transistor feature size following Moore's Law has resulted in higher operating frequencies and lower power consumption of ICs. However, continued scaling, especially below 22nm, has made it difficult to meet power consumption requirements because leakage worsens. Thus, although planar Bulk CMOS is still the most prevalent technology in semiconductor fabrication, the industry has turned its attention to reconsidering the manufacturing process and developing alternative 3D technologies. The successor of the traditional planar Bulk technology is another planar technology called Silicon On Insulator (SOI). In SOI technology, an additional ultra-thin insulator layer, usually silicon dioxide, called Buried Oxide (BOX), is placed within the substrate by one of various methods, as shown in Figure 6-2(a). Figure 6-2 shows two types of SOI devices, the Partially-Depleted SOI nMOSFET (PDSOI) and the Fully-Depleted SOI nMOSFET (FDSOI). The difference between these variants is that in PDSOI the silicon layer above the BOX is thick enough that a region of undepleted silicon remains between the depletion region and the BOX, whereas in FDSOI the depletion region covers the whole silicon region between the gate and the BOX. This happens because the silicon layer above the BOX is thin, rendering the transistor fully depleted. The PDSOI MOSFET behaves more or less like the Bulk MOSFET. Also, note that the remaining manufacturing steps are the same as for Bulk.

Figure 6-2: Structure of typical (a) PDSOI and (b) FDSOI nMOSFET devices.
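The distinction between the two SOI variants can be illustrated with a first-order estimate: the silicon film is fully depleted when it is thinner than the maximum depletion width under the gate, $W_{d,max} = \sqrt{4\varepsilon_{Si}\phi_F/(qN_A)}$ with $\phi_F = (kT/q)\ln(N_A/n_i)$. The following Python sketch applies this textbook one-dimensional criterion. The doping concentration and the film thicknesses are hypothetical values chosen for illustration and are not the parameters of the simulated devices; the function names are assumptions as well.

```python
import math

Q = 1.602e-19               # elementary charge [C]
EPS_SI = 11.7 * 8.854e-14   # permittivity of silicon [F/cm]
KT_Q = 0.02585              # thermal voltage near 300 K [V]
N_I = 1.0e10                # intrinsic carrier concentration of Si near 300 K [cm^-3]


def max_depletion_width_nm(n_a_cm3):
    """Maximum depletion width at strong inversion for a p-type film, in nm."""
    phi_f = KT_Q * math.log(n_a_cm3 / N_I)               # Fermi potential [V]
    w_cm = math.sqrt(4.0 * EPS_SI * phi_f / (Q * n_a_cm3))
    return w_cm * 1e7                                     # cm -> nm


def classify_soi(t_si_nm, n_a_cm3):
    """Return 'FDSOI' if the film is thinner than the maximum depletion width."""
    return "FDSOI" if t_si_nm < max_depletion_width_nm(n_a_cm3) else "PDSOI"


# Hypothetical body doping and film thicknesses (not taken from the thesis).
N_A = 1.0e18
print(f"W_dmax = {max_depletion_width_nm(N_A):.1f} nm")
print("10 nm film :", classify_soi(10.0, N_A))
print("100 nm film:", classify_soi(100.0, N_A))
```

For the assumed doping of $10^{18}\,cm^{-3}$ the estimate gives a maximum depletion width of roughly 35nm, so the 10nm film is classified as fully depleted and the 100nm film as partially depleted. The estimate ignores the back interface and short-channel effects, which a TCAD simulation accounts for.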

The advantages of this technology over conventional Bulk technology have been reported to be significant for feature sizes below 22nm. The most important advantage is the reduction in leakage compared with sub-22nm Bulk transistors. However, the fabrication of specialized SOI substrates is more expensive than that of Bulk substrates, due to the additional steps needed.

## 6.2 Heavy Ion Strike TCAD Characterization

#### 6.2.1 About TCAD simulations

Technology Computer-Aided Design (TCAD) is a category of EDA tools for the modeling, simulation and optimization of semiconductor process technologies and devices. Such tools model the physical and electrical properties and behavior of semiconductor devices by numerically solving fundamental physical partial differential equations, such as the Poisson equation for the electrostatic potential and electric field, and the carrier transport and diffusion equations. These equations are solved in time and space over the regions of the device. Thus, the virtual semiconductor device, either 2-dimensional or 3-dimensional, needs to be discretized into a finite number of grid elements, forming a network called a mesh, on which the aforementioned equations are solved iteratively until they converge to a solution. A denser mesh, that is, one with a larger number of grid elements, provides a more accurate solution, albeit at the expense of simulation time. Owing to their physical foundation, TCAD tools provide accurate predictions of semiconductor device behavior that are not restricted to a particular technology but are applicable to a wide range of technologies. Since they constitute a fast and inexpensive modeling approach, they can substitute the expensive and time-consuming procedure of manufacturing and testing wafers during the early development stages of a new semiconductor technology. Also, the variety of physical models and parameters that a TCAD tool provides allows for different simulation setups and calibrations to obtain an accurate result and optimize the semiconductor technology before proceeding to device fabrication.
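The trade-off between mesh density, accuracy and simulation effort can be illustrated on a toy problem. The sketch below is not TCAD code; it solves a one-dimensional Poisson-type equation, $-u'' = \pi^2 \sin(\pi x)$ with $u(0)=u(1)=0$, on a uniform mesh using Gauss-Seidel iterations and compares the result with the known exact solution $u = \sin(\pi x)$. The problem, the solver choice and all names are assumptions made purely for illustration.

```python
import math


def solve_poisson_1d(n_cells, tol=1e-10, max_iter=200_000):
    """Gauss-Seidel solution of -u'' = pi^2 sin(pi x), u(0) = u(1) = 0.

    Returns the maximum error against the exact solution sin(pi x)
    and the number of sweeps needed to converge on the given mesh.
    """
    h = 1.0 / n_cells
    x = [i * h for i in range(n_cells + 1)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * (n_cells + 1)            # initial guess, boundary values stay 0
    iteration = 0
    for iteration in range(1, max_iter + 1):
        largest_update = 0.0
        for i in range(1, n_cells):      # interior mesh points only
            new = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
            largest_update = max(largest_update, abs(new - u[i]))
            u[i] = new
        if largest_update < tol:         # converged: updates became negligible
            break
    error = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
    return error, iteration


coarse_error, coarse_iters = solve_poisson_1d(8)
fine_error, fine_iters = solve_poisson_1d(32)
print(f"8 cells : error {coarse_error:.2e} after {coarse_iters} sweeps")
print(f"32 cells: error {fine_error:.2e} after {fine_iters} sweeps")
```

The finer mesh yields a smaller discretization error but requires considerably more sweeps, which mirrors, on a much smaller scale, the accuracy versus runtime behavior described above.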

TCAD tools are widely used in the semiconductor industry as well as in academia for research purposes. For the purpose of this dissertation, Synopsys<sup>®</sup> TCAD Sentaurus<sup>™</sup> is used, as it is a leading simulation package and provides an integrated framework combined with a powerful and user-friendly GUI environment. A typical TCAD Sentaurus flow begins with the creation of the virtual device structure in the Sentaurus Process modeling environment. Once the device structure is generated, it needs to be re-meshed to improve simulation efficiency and robustness. Then, Sentaurus Device is used to simulate the electrical behavior of the device and, finally, Sentaurus Visual and Inspect provide the visualization of the simulation results and their plots, respectively.

#### 6.2.2 Simulation setup

The first step for the TCAD simulations was to define the devices by specifying their physical characteristics, such as the gate length and the device width. For the purpose of this dissertation, we utilized a typical built-in Bulk nMOSFET device from TCAD Sentaurus. Subsequently, based on this structure, we created the Bulk pMOSFET device structure by making the appropriate adjustments. We also created the nMOSFET and pMOSFET for the FDSOI technology. Table 6.1 presents some of the basic physical parameters of the nMOSFET devices for the Bulk and FDSOI technologies. As discussed, the characterization of the ionizing particle strike on the nMOS and pMOS devices should be performed while the devices are in the off state. Therefore, for the nMOS device simulations the drain contact was biased at the supply voltage and the source, gate and substrate contacts were grounded. Conversely, for the pMOS device simulations, the drain contact was grounded, and the source, gate and substrate contacts were biased at the supply voltage. For the subsequent simulations, the supply voltage was set to 1.1V, unless stated otherwise.

Table 6.1: Physical parameters of nMOS devices

| Parameters | nMOS Bulk (nm) | nMOS FDSOI (nm) |
|------------|----------------|-----------------|
| $L_g$      | 45             | 45              |
| $L_{s,d}$  | 25             | 25              |
| W          | 100            | 100             |
| $T_{s,d}$  | 75             | 50              |
| $T_{ox}$   | 3              | 3               |
| $T_{box}$  | -              | 25              |
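The off-state bias conditions described in this subsection can be summarized compactly. The following Python sketch only restates the contact voltages given in the text for a 1.1V supply; the function names and the dictionary layout are hypothetical and do not correspond to any Sentaurus interface.

```python
VDD = 1.1  # supply voltage stated in the text [V]


def off_state_bias(device, vdd=VDD):
    """Contact voltages that hold the device in the off state."""
    if device == "nmos":
        return {"drain": vdd, "source": 0.0, "gate": 0.0, "substrate": 0.0}
    if device == "pmos":
        return {"drain": 0.0, "source": vdd, "gate": vdd, "substrate": vdd}
    raise ValueError(f"unknown device type: {device!r}")


def is_off_with_reverse_biased_drain(device, bias):
    """Check that V_GS = 0 and that the drain/body junction is reverse biased."""
    v_gs = bias["gate"] - bias["source"]
    v_db = bias["drain"] - bias["substrate"]
    if device == "nmos":
        return v_gs == 0.0 and v_db > 0.0   # n+ drain above p-type body
    return v_gs == 0.0 and v_db < 0.0       # p+ drain below n-type body


for dev in ("nmos", "pmos"):
    bias = off_state_bias(dev)
    print(dev, bias, is_off_with_reverse_biased_drain(dev, bias))
```

In both cases the gate-to-source voltage is zero and the drain junction is reverse biased, which is the condition under which the struck drain collects the generated charge.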

One of the features of TCAD tools is that they provide an effective method to model and simulate the impact of SEEs on semiconductor devices through their integrated physics models. Thus, expensive and time-consuming beam experiments, such as those with neutron beam setups, can largely be avoided. To identify the behavior of the FDSOI technology and compare it with conventional Bulk, several TCAD simulations are conducted. TCAD Sentaurus models radiation incidents due to gamma radiation, alpha particles and heavy ions. For the SET pulse characterization, the heavy ion model is utilized, which is activated by including the keyword HeavyIon in a properly defined Physics section. A typical HeavyIon statement is presented in Algorithm 5. Several parameters are needed to model a heavy ion strike on a semiconductor device. First, the Location option, which is the point where the heavy ion enters the device, the Direction option, which is the direction of motion of the ion, and the Time option, which indicates the moment of the heavy ion strike, are specified. Then, the $LET\_f$, $Wt\_hi$, and Length parameters define the LET function, the characteristic perpendicular distance from the track, and the length of the heavy ion track, respectively. Finally, the shape of the spatial distribution R(w) of the heavy ion track is selected with the Exponential or Gaussian option. The PicoCoulomb option is used to switch the units for $LET\_f$, $Wt\_hi$, and Length.

#### **Algorithm 5** A Heavy Ion model statement in a Physics section

```
Physics { ...
  HeavyIon (
    Direction = (0, 1)
    Location = (1.5, 0)
    Time = 1.0e-13
    Length = [1e-4 1.5e-4 1.6e-4 1.7e-4]
    LET_f = [1e6 2e6 3e6 4e6]
    Wt_hi = [0.3e-4 0.2e-4 0.25e-4 0.1e-4]
    Exponential
    PicoCoulomb
  ) ...
}
```
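For reference, the two track shapes are commonly written as $R(w) = \exp(-w/w_t)$ for the Exponential option and $R(w) = \exp(-(w/w_t)^2)$ for the Gaussian option, where $w$ is the perpendicular distance from the track and $w_t$ is the characteristic distance. The Python sketch below assumes these forms (the tool manual remains the authoritative definition) and compares how quickly each profile decays. Interpreting the $0.05\mu m$ track width used later in this chapter as $w_t$ is an assumption made for illustration, and the function name is hypothetical.

```python
import math


def radial_profile(w_um, wt_um, shape="exponential"):
    """Relative charge density at perpendicular distance w from the ion track."""
    if wt_um <= 0.0:
        raise ValueError("characteristic distance must be positive")
    ratio = abs(w_um) / wt_um
    if shape == "exponential":
        return math.exp(-ratio)
    if shape == "gaussian":
        return math.exp(-ratio ** 2)
    raise ValueError(f"unknown shape: {shape!r}")


WT = 0.05  # assumed characteristic distance [um]
for w in (0.0, 0.05, 0.10, 0.15):
    e = radial_profile(w, WT, "exponential")
    g = radial_profile(w, WT, "gaussian")
    print(f"w = {w:.2f} um  exponential = {e:.4f}  gaussian = {g:.4f}")
```

Both profiles coincide on the track axis and at $w = w_t$; beyond that distance the Gaussian profile falls off much faster, so it confines the generated charge more tightly around the track.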

With the TCAD simulations we are able to model the disturbance that a heavy ion strike causes in device operation. Figure 6-3 illustrates a 3D MOSFET device that is affected by a heavy ion strike on its drain, resulting in charge generation along the heavy ion track in the silicon.

#### 6.2.3 NMOS and PMOS device simulations

Figure 6-3: 3D illustration of a heavy ion strike on the drain of an nMOSFET and the induced charge generation along its track.

The dominant factor determining the charge generation when a heavy ion strikes a sensitive device node is the energy of the heavy ion, that is, the LET it delivers as it penetrates the semiconductor. We performed numerous TCAD simulations to identify the impact of the LET on the charge generation, which is reflected in the drain current ($I_{drain}$) of the device. Figure 6-4 illustrates the drain currents for various LET values, ranging from $1\,MeV \cdot cm^2/mg$ to $70\,MeV \cdot cm^2/mg$, for both nMOS and pMOS devices and for both Bulk and FDSOI technologies. The width of the heavy ion track was set to $0.05\mu m$ and the length of the ion track to $0.15\mu m$. Generally, the drain current increases linearly with the LET, except for small LET values (below $10\,MeV \cdot cm^2/mg$). Also, the drain current for the Bulk technology is greater than for FDSOI, indicating that FDSOI-based devices are more resilient to radiation hazards. This trend holds for both nMOS and pMOS devices. However, for the same LET, the drain current is greater in nMOS devices than in pMOS devices. We infer from both plots that off nMOS devices are more vulnerable to heavy ions than off pMOS devices.
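To give a feeling for the magnitudes involved, the LET values above can be converted into deposited charge. The sketch below uses the commonly quoted approximation that, in silicon, a LET of about $97\,MeV \cdot cm^2/mg$ corresponds to $1\,pC/\mu m$. Both this conversion factor and the function names are assumptions that are not taken from the simulation setup, and the result is the charge deposited along the track, not the charge eventually collected at the drain.

```python
# Approximate conversion for silicon: ~97 MeV*cm^2/mg per pC/um (assumed value).
MEV_CM2_MG_PER_PC_UM = 97.0


def let_to_pc_per_um(let_mev_cm2_mg):
    """Convert a LET in MeV*cm^2/mg to a line charge density in pC/um."""
    return let_mev_cm2_mg / MEV_CM2_MG_PER_PC_UM


def deposited_charge_fc(let_mev_cm2_mg, track_length_um):
    """Charge deposited along a track of the given length, in fC."""
    return let_to_pc_per_um(let_mev_cm2_mg) * track_length_um * 1000.0


TRACK_LENGTH_UM = 0.15  # ion track length used in the simulations above
for let in (1.0, 10.0, 70.0):
    q_fc = deposited_charge_fc(let, TRACK_LENGTH_UM)
    print(f"LET = {let:5.1f} MeV*cm^2/mg -> {q_fc:7.2f} fC deposited")
```

Under these assumptions a LET of $10\,MeV \cdot cm^2/mg$ deposits roughly 15fC over the $0.15\mu m$ track. The deposited charge scales linearly with the LET, which is consistent with the linear trend of the drain current at larger LET values; the deviation at small LETs is a device-level effect that this estimate does not capture.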

As a heavy ion strikes the drain, the resultant charge generation is reflected in the drain current. Figure 6-5 shows the generated current pulses at the drain of the nMOS devices for both Bulk and FDSOI technologies when heavy ions of three different LETs are injected. The current spike grows as the LET increases, indicating the significance of the heavy ion energy for the generation of SETs and, subsequently, for the emergence of soft errors.
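At circuit level, current pulses of this kind are often approximated by a double-exponential waveform, $I(t) = \frac{Q}{\tau_f - \tau_r}\left(e^{-t/\tau_f} - e^{-t/\tau_r}\right)$, whose time integral equals the collected charge $Q$. The Python sketch below evaluates this compact approximation. It is not fitted to the TCAD pulses of Figure 6-5; the charge and the time constants are hypothetical values chosen only to show the shape of the pulse.

```python
import math


def double_exp_current(t_ps, q_fc, tau_rise_ps, tau_fall_ps):
    """Double-exponential current pulse in fC/ps (numerically equal to mA)."""
    if t_ps < 0.0:
        return 0.0
    scale = q_fc / (tau_fall_ps - tau_rise_ps)
    return scale * (math.exp(-t_ps / tau_fall_ps) - math.exp(-t_ps / tau_rise_ps))


def peak_time_ps(tau_rise_ps, tau_fall_ps):
    """Time at which the pulse reaches its maximum (zero of the derivative)."""
    return (tau_rise_ps * tau_fall_ps / (tau_fall_ps - tau_rise_ps)
            * math.log(tau_fall_ps / tau_rise_ps))


# Hypothetical parameters, not extracted from the simulations.
Q_FC, TAU_R, TAU_F = 15.0, 2.0, 20.0
t_peak = peak_time_ps(TAU_R, TAU_F)
i_peak = double_exp_current(t_peak, Q_FC, TAU_R, TAU_F)
print(f"peak of {i_peak:.3f} mA at t = {t_peak:.2f} ps")
```

For fixed time constants, the peak current is proportional to the charge, so a larger LET translates directly into a taller pulse, in line with the trend described above.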



Figure 6-4: SET drain current for various LETs and both Bulk and FDSOI technologies when (a) nMOSFET and (b) pMOSFET are affected.

The angle of the heavy ion strike plays a crucial role in charge generation. According to the simulations performed for different strike angles, shown in Figure 6-6, a heavy ion that strikes the drain horizontally generates the least current, especially for the FDSOI technology. On the other hand, the drain current reaches its maximum value when the heavy ion strikes the transistor vertically, indicating the importance of considering the strike angle in the susceptibility evaluation.

#### 6.2.4 Mixed-mode CMOS Inverter simulation

TCAD tools, like Sentaurus Device, provide the capability of performing mixed-mode simulations, that is, simulations that combine semiconductor devices based on physical models (TCAD) with devices described by compact models (SPICE). The main advantage of this type of simulation is that it enhances the SPICE circuit simulation with the physical aspects of the devices, making it more realistic and accurate. Also, note that the compact models can be combined with any type of physical device, regardless of its dimensionality (e.g., 2D or 3D).

Figure 6-5: nMOS drain current pulses for various LETs of (a) Bulk and (b) FDSOI technologies.

Figure 6-6: SET drain current for three different strike angles and both Bulk and FDSOI technologies when (a) nMOSFET and (b) pMOSFET are affected.

A mixed-mode simulation can also be performed to identify the impact of ionizing radiation on logic gates and circuits. In this way, the simulation of a heavy ion striking a node becomes more accurate, as the simulation incorporates the physical devices (i.e., transistors) modeled with a powerful TCAD tool. Figure 6-7 demonstrates a transient analysis of an Inverter, performed with a mixed-mode simulation, when two heavy ions strike the gate at different moments in time. The first strikes the off pMOS transistor and the second the off nMOS transistor, generating a small and a large glitch at the output, respectively. To sum up, the higher accuracy of such simulations can be exploited in the development of radiation-hardened designs.



Figure 6-7: Transient plot of a CMOS Inverter when two heavy ions strike the nMOS and pMOS transistors at different moments in time.

# Chapter 7

## Conclusions and Future Work

This chapter summarizes the main contributions and outcomes of this research and concludes the dissertation with recommendations and future directions that build on the results obtained, in order to further advance research in this field.

### 7.1 Conclusion

Over the past decades, the reliability of modern ICs has evolved into a major concern, mostly due to the continuous downscaling of CMOS technology and its implications, such as high operating frequencies and low power consumption. This technology trend also affects the susceptibility of semiconductor devices to radiation-induced soft errors, which is expected to worsen, rendering upcoming generations of systems more vulnerable to such errors. To combat this hazard, it is crucial to understand the mechanism of soft errors and develop methodologies for accurate SER estimation. The knowledge of circuit vulnerability may then be exploited in the design process to develop reliable chips through radiation-hardening techniques.

To this end, in this dissertation we have dealt with soft errors in the combinational logic of ICs. In particular, a methodology for SER estimation that takes SEMTs into consideration is presented. The Monte Carlo simulations, along with the detailed modeling of the masking mechanisms, provide an accurate result. Another key factor in the development of an accurate tool is the SET pulse characterization through electrical-level simulations. A detailed procedure for the identification of SEMTs is also presented.

The proposed SER estimation framework is evaluated on a set of ISCAS '89 benchmark circuits. In particular, the identification of both the most vulnerable sites of a circuit and the most vulnerable logic gates provides an overview of circuit reliability, allowing for selective hardening strategies. A comparison between the features of the proposed framework and those of other related approaches indicates the accuracy of our approach. This is confirmed by the verification results with HSPICE, which show a small divergence.

Also, our work examines the behavior of the state-of-the-art FDSOI technology in a radiation environment by means of TCAD simulations. The results are compared with those of the conventional Bulk technology, indicating the differences with respect to susceptibility.

### 7.2 Future Work

An integrated framework for SER estimation is presented in this dissertation. However, there is still room for improvement. Since the proposed tool is based on Monte Carlo simulations, the evaluation of large-scale circuits is an extremely time-consuming process, even when various speed-up approaches are applied. Thus, further acceleration techniques may improve runtime.

Semiconductor device technology advances continuously, rendering previous technologies obsolete over time. Thus, EDA tools need to stay up to date and remain compatible with state-of-the-art technologies. The proposed SER estimation tool can be easily extended to provide realistic results regarding circuit reliability in technologies below 20nm, in addition to the 45nm technology that it currently supports. Since the main methodology is robust, only a few critical parameters of the new technology, such as SET pulse widths or timing parameters, need to be imported into our tool. Thus, a pre-characterization phase is the key to an accurate SER estimation for various technologies.

Hopefully, this work will constitute a baseline methodology for further SER estimation approaches, paving the way for other researchers to expand it or contribute to this direction with advanced algorithms or, probably, the use of AI and deep learning, the fastest growing technology in the world today.

# Appendix A

# Overall SER Results of ISCAS '89 Benchmarks on 45nm and 15nm Technologies

In this appendix we present the SER results for the whole ISCAS '89 benchmark suite and for both 45nm and 15nm technologies. The SER was calculated in terms of FIT, which accounts for the actual circuit die size and the radiation flux in addition to the raw failure probability. In this context, we considered the neutron flux at sea level in New York City, which corresponds to $20.329\ neutrons/(cm^2 \cdot h)$ according to [12], and a temperature of 25 °C. Table A.1 and Table A.2 report the SER results for 45nm and 15nm technologies, respectively. Note that the circuit information is based on the pre-synthesis Verilog description file.

Table A.1: SER evaluation results, obtained from the proposed tool, for the ISCAS '89 benchmarks and 45nm technology. The number of nodes, primary inputs, gates and D-FFs indicate the benchmark complexity, Fail. Rate and FIT denote the SER as failure probability and in terms of FIT, respectively, whereas Ex. Time is the average execution time.

| Bench. | Nodes     | PIs   | Gates  | FFs  | Fail. Rate | FIT                   | Ex. Time |
|--------|-----------|-------|--------|------|------------|-----------------------|----------|
| s27    | 17        | 4     | 13     | 3    | 0.1527     | $1.00 \times 10^{-6}$ | < 1s     |
| s298   | 169       | 3     | 166    | 14   | 0.0976     | $3.72 \times 10^{-6}$ | < 1s     |
| s344   | 240       | 9     | 231    | 15   | 0.1314     | $5.71 \times 10^{-6}$ | < 1s     |
| s349   | 224       | 9     | 215    | 15   | 0.1902     | $8.10 \times 10^{-6}$ | < 1s     |
| s382   | 196       | 3     | 193    | 21   | 0.0974     | $4.96 \times 10^{-6}$ | < 1s     |
| s400   | 203       | 3     | 200    | 21   | 0.1341     | $6.92 \times 10^{-6}$ | < 1s     |
| s420   | 252       | 19    | 233    | 16   | 0.0708     | $3.32 \times 10^{-6}$ | < 1s     |
| s526   | 280       | 3     | 277    | 21   | 0.1161     | $6.94 \times 10^{-6}$ | < 1s     |
| s641   | 517       | 35    | 482    | 19   | 0.0528     | $2.78 \times 10^{-6}$ | < 1s     |
| s713   | 539       | 35    | 504    | 19   | 0.0519     | $2.74 \times 10^{-6}$ | < 1s     |
| s820   | 443       | 18    | 425    | 5    | 0.0347     | $2.10 \times 10^{-6}$ | < 1s     |
| s953   | 496       | 16    | 480    | 29   | 0.0624     | $6.71 \times 10^{-6}$ | < 1s     |
| s1238  | 768       | 14    | 754    | 18   | 0.0427     | $5.27 \times 10^{-6}$ | < 1s     |
| s1423  | 1008      | 17    | 991    | 74   | 0.0479     | $9.70 \times 10^{-6}$ | < 1s     |
| s1488  | 1211      | 8     | 1203   | 6    | 0.0067     | $8.11 \times 10^{-7}$ | < 1s     |
| s5378  | 3053      | 35    | 3018   | 179  | 0.0841     | $3.94 \times 10^{-5}$ | 2s       |
| s9234  | 7002      | 19    | 6983   | 228  | 0.0718     | $2.62 \times 10^{-5}$ | < 1s     |
| s13207 | 9608      | 31    | 9577   | 669  | 0.0423     | $5.36 \times 10^{-5}$ | 7.3s     |
| s15850 | 12115     | 14    | 12101  | 597  | 0.0413     | $5.30 \times 10^{-5}$ | 12s      |
| s35932 | 21278     | 35    | 21243  | 1728 | 0.0069     | $2.80 \times 10^{-5}$ | 128s     |
| s38417 | 24874     | 28    | 23815  | 1636 | 0.0714     | $2.79 \times 10^{-4}$ | 130s     |
| s38584 | 21407     | 38    | 20679  | 1426 | 0.0084     | $3.01 \times 10^{-5}$ | 124s     |

Table A.2: SER evaluation results, obtained from the proposed tool, for the ISCAS '89 benchmarks and 15nm technology. The number of nodes, primary inputs, gates and D-FFs indicate the benchmark complexity, Fail. Rate and FIT denote the SER as failure probability and in terms of FIT, respectively, whereas Ex. Time is the average execution time.

| Bench. | Nodes     | PIs   | Gates  | FFs  | Fail. Rate | FIT                   | Ex. Time |
|--------|-----------|-------|--------|------|------------|-----------------------|----------|
| s27    | 17        | 4     | 13     | 3    | 0.2917     | $4.83 \times 10^{-7}$ | < 1s     |
| s298   | 169       | 3     | 166    | 14   | 0.2457     | $2.54 \times 10^{-6}$ | < 1s     |
| s344   | 240       | 9     | 231    | 15   | 0.1842     | $2.23 \times 10^{-6}$ | < 1s     |
| s349   | 224       | 9     | 215    | 15   | 0.2458     | $3.01 \times 10^{-6}$ | < 1s     |
| s382   | 196       | 3     | 193    | 21   | 0.1312     | $1.88 \times 10^{-6}$ | < 1s     |
| s400   | 203       | 3     | 200    | 21   | 0.2206     | $3.22 \times 10^{-6}$ | < 1s     |
| s420   | 252       | 19    | 233    | 16   | 0.0947     | $1.23 \times 10^{-6}$ | < 1s     |
| s526   | 280       | 3     | 277    | 21   | 0.1875     | $3.08 \times 10^{-6}$ | < 1s     |
| s641   | 517       | 35    | 482    | 19   | 0.0971     | $1.45 \times 10^{-6}$ | < 1s     |
| s713   | 539       | 35    | 504    | 19   | 0.0852     | $1.28 \times 10^{-6}$ | < 1s     |
| s820   | 443       | 18    | 425    | 5    | 0.0588     | $9.78 \times 10^{-7}$ | < 1s     |
| s953   | 496       | 16    | 480    | 29   | 0.0957     | $2.96 \times 10^{-6}$ | < 1s     |
| s1238  | 768       | 14    | 754    | 18   | 0.0782     | $2.69 \times 10^{-6}$ | < 1s     |
| s1423  | 1008      | 17    | 991    | 74   | 0.0613     | $3.42 \times 10^{-6}$ | < 1s     |
| s1488  | 1211      | 8     | 1203   | 6    | 0.0101     | $3.42 \times 10^{-7}$ | < 1s     |
| s5378  | 3053      | 35    | 3018   | 179  | 0.1282     | $1.66 \times 10^{-5}$ | 2s       |
| s9234  | 7002      | 19    | 6983   | 228  | 0.1103     | $1.11 \times 10^{-5}$ | < 1s     |
| s13207 | 9608      | 31    | 9577   | 669  | 0.0747     | $2.71 \times 10^{-5}$ | 7.3s     |
| s15850 | 12115     | 14    | 12101  | 597  | 0.0781     | $2.75 \times 10^{-5}$ | 12s      |
| s35932 | 21278     | 35    | 21243  | 1728 | 0.0098     | $1.11 \times 10^{-5}$ | 128s     |
| s38417 | 24874     | 28    | 23815  | 1636 | 0.0922     | $1.00 \times 10^{-4}$ | 130s     |
| s38584 | 21407     | 38    | 20679  | 1426 | 0.0182     | $1.83 \times 10^{-5}$ | 124s     |

# Bibliography

- [1] P. Hazucha and C. Svensson, "Impact of cmos technology scaling on the atmospheric neutron soft error rate," *IEEE Transactions on Nuclear science*, vol. 47, no. 6, pp. 2586–2594, 2000.
- [2] N. Seifert, P. Slankard, M. Kirsch, B. Narasimham, V. Zia, C. Brookreson, A. Vo, S. Mitra, B. Gill, and J. Maiz, "Radiation-induced soft error rates of advanced cmos bulk devices," in 2006 IEEE International Reliability Physics Symposium Proceedings, pp. 217–225, 2006.
- [3] P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi, "Modeling the effect of technology trends on the soft error rate of combinational logic," in *Proceedings International Conference on Dependable Systems and Networks*, pp. 389–398, 2002.
- [4] C. Constantinescu, "Trends and challenges in vlsi circuit reliability," *IEEE micro*, vol. 23, no. 4, pp. 14–19, 2003.
- [5] R. Baumann, "The impact of technology scaling on soft error rate performance and limits to the efficacy of error correction," in *Digest. International Electron Devices Meeting*, pp. 329–332, 2002.
- [6] M. Nicolaidis, Soft errors in modern electronic systems, vol. 41. Springer Science & Business Media, 2010.
- [7] D. Binder, E. C. Smith, and A. Holman, "Satellite anomalies from galactic cosmic rays," *IEEE Transactions on Nuclear Science*, vol. 22, no. 6, pp. 2675–2680, 1975.
- [8] T. C. May and M. H. Woods, "Alpha-particle-induced soft errors in dynamic memories," *IEEE transactions on Electron devices*, vol. 26, no. 1, pp. 2–9, 1979.
- [9] S. Kirkpatrick, "Modeling diffusion and collection of charge from ionizing radiation in silicon devices," *IEEE Transactions on Electron Devices*, vol. 26, no. 11, pp. 1742–1753, 1979.
- [10] J. F. Ziegler and W. A. Lanford, "Effect of cosmic rays on computer memories," Science, vol. 206, no. 4420, pp. 776–788, 1979.
- [11] —, "The effect of sea level cosmic rays on electronic devices," *Journal of applied physics*, vol. 52, no. 6, pp. 4305–4312, 1981.

- [12] J. F. Ziegler, "Terrestrial cosmic rays," *IBM journal of research and development*, vol. 40, no. 1, pp. 19–39, 1996.
- [13] J. F. Ziegler, H. W. Curtis, H. P. Muhlfeld, C. J. Montrose, B. Chin, M. Nicewicz, C. Russell, W. Y. Wang, L. B. Freeman, P. Hosier et al., "Ibm experiments in soft fails in computer electronics (1978–1994)," IBM journal of research and development, vol. 40, no. 1, pp. 3–18, 1996.
- [14] R. Baumann, T. Hossain, S. Murata, and H. Kitagawa, "Boron compounds as a dominant source of alpha particles in semiconductor devices," in *Proceedings of 1995 IEEE International Reliability Physics Symposium*, pp. 297–302, 1995.
- [15] R. C. Baumann, "Soft errors in advanced semiconductor devices-part i: the three radiation sources," *IEEE Transactions on device and materials reliability*, vol. 1, no. 1, pp. 17–22, 2001.
- [16] R. C. Baumann and E. B. Smith, "Neutron-induced boron fission as a major source of soft errors in deep submicron sram devices," in 2000 IEEE International Reliability Physics Symposium Proceedings. 38th Annual (Cat. No. 00CH37059), pp. 152–157, 2000.
- [17] R. C. Baumann, "Radiation-induced soft errors in advanced semiconductor technologies," *IEEE Transactions on Device and materials reliability*, vol. 5, no. 3, pp. 305–316, 2005.
- [18] P. E. Dodd and L. W. Massengill, "Basic mechanisms and modeling of single-event upset in digital microelectronics," *IEEE Transactions on Nuclear Science*, vol. 50, no. 3, pp. 583–602, 2003.
- [19] L. Zheng, C. Shu-Ming, C. Jian-Jun, Q. Jun-Rui, and L. Rong-Rong, "Parasitic bipolar amplification in a single event transient and its temperature dependence," *Chinese Physics B*, vol. 21, no. 9, p. 099401, 2012.
- [20] G. Messenger, "Collection of charge on junction nodes from ion tracks," *IEEE Transactions on Nuclear Science*, vol. 29, no. 6, pp. 2024–2031, 1982.
- [21] G. Srinivasan, P. Murley, and H. Tang, "Accurate, predictive modeling of soft error rate due to cosmic rays and chip alpha radiation," in *Proceedings of 1994 IEEE International Reliability Physics Symposium*, pp. 12–16, 1994.
- [22] V. Chandra and R. Aitken, "Impact of technology and voltage scaling on the soft error susceptibility in nanoscale cmos," in 2008 IEEE International Symposium on Defect and Fault Tolerance of VLSI Systems, pp. 114–122, 2008.
- [23] S. Sayil, Soft error mechanisms, modeling and mitigation. Springer, 2016.
- [24] P. Roche, J.-L. Autran, G. Gasiot, and D. Munteanu, "Technology downscaling worsening radiation effects in bulk: Soi to the rescue," in 2013 IEEE International Electron Devices Meeting, pp. 31–1, 2013.

- [25] N. Kaul, B. Bhuva, and S. Kerns, "Simulation of seu transients in cmos ics," *IEEE Transactions on Nuclear Science*, vol. 38, no. 6, pp. 1514–1520, 1991.
- [26] T. Juhnke and H. Klar, "Calculation of the soft error rate of submicron cmos logic circuits," *IEEE Journal of Solid-State Circuits*, vol. 30, no. 7, pp. 830–834, 1995.
- [27] P. C. Murley and G. Srinivasan, "Soft-error monte carlo modeling program, semm," *IBM Journal of Research and Development*, vol. 40, no. 1, pp. 109–118, 1996.
- [28] P. Dahlgren and P. Lidén, "A switch-level algorithm for simulation of transients in combinational logic," in *Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers*, pp. 207–216, 1995.
- [29] A. Dharchoudhury, S.-M. Kang, H. Cha, and J. H. Patel, "Fast timing simulation of transient faults in digital circuits," in *ICCAD*, vol. 94, pp. 719–722, 1994.
- [30] H. Cha, E. M. Rudnick, J. H. Patel, R. K. Iyer, and G. S. Choi, "A gate-level simulation environment for alpha-particle-induced transient faults," *IEEE Transactions on Computers*, vol. 45, no. 11, pp. 1248–1256, 1996.
- [31] C. Zhao, X. Bai, and S. Dey, "A scalable soft spot analysis methodology for compound noise effects in nano-meter circuits," in *Proceedings of the 41st annual Design Automation Conference*, pp. 894–899, 2004.
- [32] R. Rajaraman, J. Kim, N. Vijaykrishnan, Y. Xie, and M. J. Irwin, "Seat-la: A soft error analysis tool for combinational logic," in 19th International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design (VLSID'06), pp. 4–pp, 2006.
- [33] D. Bountas and G. I. Stamoulis, "Carrot—a tool for fast and accurate soft error rate estimation," in *International Workshop on Embedded Computer Systems*, pp. 331–338. Springer, 2006.
- [34] R. R. Rao, K. Chopra, D. T. Blaauw, and D. M. Sylvester, "Computing the soft error rate of a combinational logic circuit using parameterized descriptors," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 26, no. 3, pp. 468–479, 2007.
- [35] Y. S. Dhillon, A. U. Diril, and A. Chatterjee, "Soft-error tolerance analysis and optimization of nanometer circuits," in *Design, Automation, and Test in Europe*, pp. 389–400. Springer, 2008.
- [36] Y.-H. Kuo, H.-K. Peng, and C. H.-P. Wen, "Accurate statistical soft error rate (sser) analysis using a quasi-monte carlo framework with quality cell models," in 2010 11th International Symposium on Quality Electronic Design (ISQED), pp. 831–838, 2010.

- [37] M. Anglada, R. Canal, J. L. Aragón, and A. González, "Maskit: Soft error rate estimation for combinational circuits," in 2016 IEEE 34th International Conference on Computer Design (ICCD), pp. 614–621, 2016.
- [38] B. Liu and L. Cai, "Monte carlo reliability model for single-event transient on combinational circuits," *IEEE Transactions on Nuclear Science*, vol. 64, no. 12, pp. 2933–2937, 2017.
- [39] J. S. Kim, C. Nicopoulos, N. Vijaykrishnan, Y. Xie, and E. Lattanzi, "A probabilistic model for soft-error rate estimation in combinational logic," in *Proceedings of the International Workshop on Probabilistic Analysis Techniques for Real-time and Embedded Systems*, (Italy), 2004.
- [40] F. Wang and V. D. Agrawal, "Soft error rate determination for nanoscale sequential logic," in 2010 11th International Symposium on Quality Electronic Design (ISQED), pp. 225–230, 2010.
- [41] F. Wang and Y. Xie, "Soft error rate analysis for combinational logic using an accurate electrical masking model," *IEEE Transactions on Dependable and Secure Computing*, vol. 8, no. 1, pp. 137–146, 2009.
- [42] S. Krishnaswamy, G. F. Viamontes, I. L. Markov, and J. P. Hayes, "Accurate reliability evaluation and enhancement via probabilistic transfer matrices," in *Design, Automation and Test in Europe*, pp. 282–287, 2005.
- [43] M. Zhang and N. R. Shanbhag, "Soft-error-rate-analysis (sera) methodology," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 10, pp. 2140–2155, 2006.
- [44] A. C.-C. Chang, R. H.-M. Huang, and C. H.-P. Wen, "Casser: a closed-form analysis framework for statistical soft error rate," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 21, no. 10, pp. 1837–1848, 2012.
- [45] H.-M. Huang and C. H.-P. Wen, "Fast-yet-accurate statistical soft-error-rate analysis considering full-spectrum charge collection," *IEEE Design & Test*, vol. 30, no. 2, pp. 77–86, 2013.
- [46] H.-K. Peng, H.-M. Huang, Y.-H. Kuo, and C. H.-P. Wen, "Statistical soft error rate (sser) analysis for scaled cmos designs," ACM transactions on design automation of electronic systems (TODAES), vol. 17, no. 1, pp. 1–24, 2012.
- [47] S. Krishnaswamy, S. M. Plaza, I. L. Markov, and J. P. Hayes, "Signature-based ser analysis and design of logic circuits," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 28, no. 1, pp. 74–86, 2008.
- [48] R. H.-M. Huang and C. H.-P. Wen, "Advanced soft-error-rate (ser) estimation with striking-time and multi-cycle effects," in *Proceedings of the 51st Annual Design Automation Conference*, pp. 1–6, 2014.

- [49] D. Rossi, M. Omana, F. Toma, and C. Metra, "Multiple transient faults in logic: An issue for next generation ics?" in 20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'05), pp. 352–360, 2005.
- [50] J. D. Black, P. E. Dodd, and K. M. Warren, "Physics of multiple-node charge collection and impacts on single-event characterization and soft error rate prediction," *IEEE Transactions on Nuclear Science*, vol. 60, no. 3, pp. 1836–1851, 2013.
- [51] N. P. Rao and M. P. Desai, "Neutron induced strike: On the likelihood of multiple bit-flips in logic circuits," arXiv preprint arXiv:1612.08239, 2016.
- [52] N. Miskov-Zivanov and D. Marculescu, "Multiple transient faults in combinational and sequential circuits: A systematic approach," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 29, no. 10, pp. 1614–1627, 2010.
- [53] M. Fazeli, S. N. Ahmadian, S. G. Miremadi, H. Asadi, and M. B. Tahoori, "Soft error rate estimation of digital circuits in the presence of multiple event transients (METs)," in *2011 Design, Automation & Test in Europe*, pp. 1–6, 2011.
- [54] R. Rajaei, M. Tabandeh, and M. Fazeli, "Soft error rate estimation for combinational logic in presence of single event multiple transients," *Journal of Circuits, Systems and Computers*, vol. 23, no. 06, p. 1450091, 2014.
- [55] B. T. Kiddie, W. H. Robinson, and D. B. Limbrick, "Single-event multiple-transients (SEMT): Circuit characterization and analysis," in *IEEE Workshop Silicon Errors in Logic-System Effects (SELSE)*, 2013.
- [56] M. Ebrahimi, H. Asadi, and M. B. Tahoori, "A layout-based approach for multiple event transient analysis," in *2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC)*, pp. 1–6, 2013.
- [57] H.-M. Huang and C. H.-P. Wen, "Layout-based soft error rate estimation framework considering multiple transient faults—from device to circuit level," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 35, no. 4, pp. 586–597, 2015.
- [58] Y. Du and S. Chen, "A novel layout-based single event transient injection approach to evaluate the soft error rate of large combinational circuits in complimentary metal-oxide-semiconductor bulk technology," *IEEE Transactions on Reliability*, vol. 65, no. 1, pp. 248–255, 2015.
- [59] J. Li and J. Draper, "Accelerated soft-error-rate (SER) estimation for combinational and sequential circuits," *ACM Transactions on Design Automation of Electronic Systems (TODAES)*, vol. 22, no. 3, pp. 1–21, 2017.

- [60] G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou, and G. I. Stamoulis, "A placement-aware soft error rate estimation of combinational circuits for multiple transient faults in CMOS technology," in *2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT)*, pp. 1–6, 2018.
- [61] X. Cao, L. Xiao, J. Li, R. Zhang, S. Liu, and J. Wang, "A layout-based soft error vulnerability estimation approach for combinational circuits considering single event multiple transients (SEMTs)," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 38, no. 6, pp. 1109–1122, 2018.
- [62] S. Cai, B. He, W. Wang, P. Liu, F. Yu, L. Yin, and B. Li, "Soft error reliability evaluation of nanoscale logic circuits in the presence of multiple transient faults," *Journal of Electronic Testing*, pp. 1–15, 2020.
- [63] C. Rusu, A. Bougerol, L. Anghel, C. Weulerse, N. Buard, S. Benhammadi, N. Renaud, G. Hubert, F. Wrobel, T. Carrière et al., "Multiple event transient induced by nuclear reactions in CMOS logic cells," in *13th IEEE International On-Line Testing Symposium (IOLTS 2007)*, pp. 137–145, 2007.
- [64] G. I. Zebrev and A. M. Galimov, "Compact modeling and simulation of heavy ion-induced soft error rate in space environment: Principles and validation," *IEEE Transactions on Nuclear Science*, vol. 64, no. 8, pp. 2129–2135, 2017.
- [65] R. R. Rao, D. Blaauw, and D. Sylvester, "Soft error reduction in combinational logic using gate resizing and flipflop selection," in *Proceedings of the 2006 IEEE/ACM International Conference on Computer-Aided Design*, pp. 502–509, 2006.
- [66] N. Miskov-Zivanov and D. Marculescu, "MARS-S: Modeling and reduction of soft errors in sequential circuits," in *8th International Symposium on Quality Electronic Design (ISQED'07)*, pp. 893–898, 2007.
- [67] K.-C. Wu and D. Marculescu, "A low-cost, systematic methodology for soft error robustness of logic circuits," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 21, no. 2, pp. 367–379, 2012.
- [68] M. Raji and B. Ghavami, "Soft error rate reduction of combinational circuits using gate sizing in the presence of process variations," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 25, no. 1, pp. 247–260, 2016.
- [69] A. H. El-Maleh and K. A. Daud, "Simulation-based method for synthesizing soft error tolerant combinational circuits," *IEEE Transactions on Reliability*, vol. 64, no. 3, pp. 935–948, 2015.
- [70] M. Ebrahimi, H. Asadi, R. Bishnoi, and M. B. Tahoori, "Layout-based modeling and mitigation of multiple event transients," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 35, no. 3, pp. 367–379, 2015.

- [71] C. Georgakidis, G. I. Paliaroutis, N. Sketopoulos, P. Tsoumanis, C. Sotiriou, N. Evmorfopoulos, and G. Stamoulis, "A layout-based soft error rate estimation and mitigation in the presence of multiple transient faults in combinational logic," in *2020 21st International Symposium on Quality Electronic Design (ISQED)*, pp. 231–236, 2020.
- [72] A. Dixit and A. Wood, "The impact of new technology on soft error rates," in *2011 International Reliability Physics Symposium*, pp. 5B-4, 2011.
- [73] J. Furuta, C. Hamanaka, K. Kobayashi, and H. Onodera, "Measurement of neutron-induced SET pulse width using propagation-induced pulse shrinking," in *2011 International Reliability Physics Symposium*, pp. 5B–2, 2011.
- [74] A. Evans, M. Glorieux, D. Alexandrescu, C. B. Polo, and V. Ferlet-Cavrois, "Single event multiple transient (SEMT) measurements in 65 nm bulk technology," in *2016 16th European Conference on Radiation and Its Effects on Components and Systems (RADECS)*, pp. 1–6, 2016.
- [75] J. Xu, Y. Guo, R. Song, B. Liang, and Y. Chi, "Supply voltage and temperature dependence of single-event transient in 28-nm FDSOI MOSFETs," *Symmetry*, vol. 11, no. 6, p. 793, 2019.
- [76] B. Narasimham, B. L. Bhuva, R. D. Schrimpf, L. W. Massengill, M. J. Gadlage, O. A. Amusan, W. T. Holman, A. F. Witulski, W. H. Robinson, J. D. Black et al., "Characterization of digital single event transient pulse-widths in 130-nm and 90-nm CMOS technologies," *IEEE Transactions on Nuclear Science*, vol. 54, no. 6, pp. 2506–2511, 2007.
- [77] T. Loveless, J. Kauppila, S. Jagannathan, D. Ball, J. Rowe, N. Gaspard, N. Atkinson, R. Blaine, T. Reece, J. Ahlbin et al., "On-chip measurement of single-event transients in a 45 nm silicon-on-insulator technology," *IEEE Transactions on Nuclear Science*, vol. 59, no. 6, pp. 2748–2755, 2012.
- [78] V. Ferlet-Cavrois, P. Paillet, D. McMorrow, N. Fel, J. Baggio, S. Girard, O. Duhamel, J. Melinger, M. Gaillardin, J. Schwank et al., "New insights into single event transient propagation in chains of inverters—evidence for propagation-induced pulse broadening," *IEEE Transactions on Nuclear Science*, vol. 54, no. 6, pp. 2338–2346, 2007.
- [79] V. Ferlet-Cavrois, V. Pouget, D. McMorrow, J. Schwank, N. Fel, F. Essely, R. Flores, P. Paillet, M. Gaillardin, D. Kobayashi et al., "Investigation of the propagation induced pulse broadening (PIPB) effect on single event transients in SOI and bulk inverter chains," *IEEE Transactions on Nuclear Science*, vol. 55, no. 6, pp. 2842–2853, 2008.
- [80] G. Wirth, F. L. Kastensmidt, and I. Ribeiro, "Single event transients in logic circuits—load and propagation induced pulse broadening," *IEEE Transactions on Nuclear Science*, vol. 55, no. 6, pp. 2928–2935, 2008.

- [81] I. Sutherland, B. Sproull, and D. Harris, *Logical Effort: Designing Fast CMOS Circuits*. Morgan Kaufmann, 1999.
- [82] W. C. Elmore, "The transient response of damped linear networks with particular regard to wideband amplifiers," *Journal of Applied Physics*, vol. 19, no. 1, pp. 55–63, 1948.
- [83] *LEF/DEF Language Reference*, 5th ed., Cadence Design Systems, Inc., 2655 Seely Avenue, San Jose, CA 95134, USA, Nov. 2009.
- [84] D. Radaelli, H. Puchner, S. Wong, and S. Daniel, "Investigation of multi-bit upsets in a 150 nm technology SRAM device," *IEEE Transactions on Nuclear Science*, vol. 52, no. 6, pp. 2433–2437, 2005.
- [85] G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou, and G. I. Stamoulis, "SET pulse characterization and SER estimation in combinational logic with placement and multiple transient faults considerations," *Technologies*, vol. 8, no. 1, p. 5, 2020.
- [86] D. Garyfallou, S. Simoglou, N. Sketopoulos, C. Antoniadis, C. P. Sotiriou, N. Evmorfopoulos, and G. Stamoulis, "Gate delay estimation with library compatible current source models and effective capacitance," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 29, no. 5, pp. 962–972, 2021.
- [87] "Si2 45nm & 15nm open cell library," https://si2.org/open-cell-library/, Silicon Integration Initiative, Inc., [Available online: accessed on 02 November 2020].
- [88] M. Gadlage, J. Ahlbin, B. Bhuva, L. Massengill, and R. Schrimpf, "Single event transient pulse width measurements in a 65-nm bulk CMOS technology at elevated temperatures," in *2010 IEEE International Reliability Physics Symposium*, pp. 763–767, 2010.
- [89] J. E. Lilienfeld, "Method and apparatus for controlling electric currents," U.S. Patent 1,745,175, Jan. 28, 1930.