
Debugging Low Test-Coverage Situations

Scan is a structured test approach in which the overall function of an integrated circuit (IC) is broken into smaller structures and tested individually. Every state element (D flip-flop or latch) is replaced with a scan cell that operates as an equivalent state element and is concatenated into long shift registers called “scan chains” in scan mode. All the internal state elements can be converted into controllable and observable logic. This greatly reduces the complexity of testing an IC by testing small combinational logic segments between scan cells. Automatic test pattern generation (ATPG) tools take advantage of scan to produce high-quality scan patterns.

The combination of scan and ATPG tools has been shown to successfully detect the vast majority of manufacturing defects. When you use an ATPG tool, your goal should be to achieve the highest defect coverage possible. Because high test coverage directly correlates to the quality of the parts shipped, many companies demand that coverage be at least 99% for single stuck-at faults and at least 90% for transition-delay faults.

When the coverage report falls short of these goals, your task is to figure out why the coverage is not high enough and perform corrective actions where possible. Debugging low defect coverage has historically required a significant amount of manual technique and intimate knowledge of the ATPG tool, as well as design experience, especially as device complexity increases.

Automating more of the debug process during ATPG greatly simplifies this effort. I have seen some cases in which automation saved hours, even days, of manual debugging effort and other cases in which the tool provided answers when no feasible manual technique was possible. Before exploring why you might be getting low coverage and why further automation is needed, I’ll explain how ATPG tools in general classify and report different categories of faults.

INTERPRETING THE MYSTERIES OF ATPG STATISTICS

The ATPG tool generates a “statistics report” that tells you what the tool has done and provides the fault category information that you have to interpret to debug coverage problems. If you’re an expert at using an ATPG tool, you’ll probably have little problem understanding the fault categories listed in the statistics report. But if you’re not a design-for-test expert, this data may as well be written in hieroglyphics (Fig. 1). Although the statistics report contains a lot of information, it can be difficult to interpret and rarely gives enough useful information to determine the reasons for low coverage, even for an ATPG expert.
When debugging low coverage, you’ll need to understand some of the basic fault categories that are listed in most typical ATPG statistics reports. The first and broadest category is what is sometimes referred to as the “fault universe.” This is the total number of faults in a design. For example, when dealing with single stuck-at faults, you have two faults for each instance/pin, stuck_at logic 1 and stuck_at logic 0, where the instance is the full hierarchical path name to a library cell instantiated in the design netlist.

This total number of faults really is only important when comparing different ATPG tools against each other. The total can vary depending on whether “internal” faulting is turned on and whether “collapsed” faults are used. Internal faulting extends the fault site down to the ATPG-model level, rather than limiting it to the library-cell level. ATPG tools, for efficiency purposes, are designed to collapse equivalent faults whenever possible. Typically, you’ll want to have the internal-faults setting turned off and the uncollapsed-faults setting turned on. These settings most closely match the faults represented in the design netlist.
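To make the idea of the uncollapsed, library-cell-level fault universe concrete, here is a minimal Python sketch that enumerates two single stuck-at faults per instance pin. The netlist representation and instance names are invented for illustration; a real ATPG tool builds this list from its own internal models.

```python
# Minimal sketch: building an uncollapsed single stuck-at fault universe.
# The netlist structure and instance names below are illustrative only.

netlist = {
    "top/core/u_alu/U123": ["A", "B", "Y"],   # instance -> library-cell pins
    "top/core/u_ctrl/U7":  ["A", "Y"],
    "top/pads/u_io/U99":   ["A", "EN", "Y"],
}

fault_universe = [
    (instance, pin, stuck_value)
    for instance, pins in netlist.items()
    for pin in pins
    for stuck_value in ("stuck-at-0", "stuck-at-1")   # two faults per pin
]

print(len(fault_universe), "uncollapsed faults")      # 2 x total pin count
```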

SHOULD YOU CARE ABOUT UNTESTABLE/UNDETECTABLE FAULTS?

Faults that cannot possibly be tested are reported as untestable or undetectable. This includes faults that are typically referred to as unused, tied, blocked, and redundant. For example, a tied fault is one in which the designer has purposely tied a pin to logic high or logic low. If a stuck-at-1 defect were to occur on a pin that is tied high, you could not test for it because that would require the tool to be able to toggle the pin to logic low. This cannot be done because of the design restriction, so the fault is categorized as “untestable.”

Untestable/undetectable faults are significant for two reasons. First, they distinguish “fault coverage” from “test coverage,” both of which are reported by ATPG tools. When most tools calculate coverage, fault coverage includes all the faults in the design. Test coverage subtracts the untestable/undetectable faults from the total number of faults when calculating coverage. For this reason, the reported number for test coverage is typically higher than fault coverage.
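To make the two calculations concrete, here is a minimal Python sketch with invented fault counts (these numbers are purely illustrative, not taken from any real report):

```python
# Illustrative numbers only; a real statistics report supplies these counts.
total_faults      = 100_000
untestable_faults = 4_000     # unused, tied, blocked, redundant
detected_faults   = 93_000

fault_coverage = detected_faults / total_faults                        # all faults in denominator
test_coverage  = detected_faults / (total_faults - untestable_faults)  # untestable removed

print(f"fault coverage: {fault_coverage:.2%}")   # 93.00%
print(f"test coverage:  {test_coverage:.2%}")    # 96.88%
```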

The second reason that untestable/undetectable faults are important is that nothing can be done to improve the coverage of these faults; therefore, you should direct your debugging efforts elsewhere.

One last thing to be aware of regarding untestable/undetectable faults is that ATPG-tool vendors vary in how they categorize these faults. These differences can result in coverage discrepancies when comparing the results of each tool.

WHAT IS MORE IMPORTANT—TEST COVERAGE OR FAULT COVERAGE?

This raises the question of which is the more critical figure: test coverage or fault coverage? Most engineers, but not all, rely on the higher test-coverage number. The justification for ignoring untestable/undetectable faults is that any defect that occurs at one of those fault locations will not cause the device to functionally fail. For example, if a stuck-at-1 defect occurred on a pin that is tied high by design, the part will not fail in functional operation. Others would argue that fault coverage is more important because any defect, even an untestable one, represents a problem in the manufacturing of the device. That debate won’t be explored here, though.

Some faults are testable, meaning that a defect at these fault sites would result in a functional failure. Unfortunately, ATPG tools cannot produce patterns to detect all of the testable faults. These testable but undetected faults are called “ATPG_untestable” (AU).

Of all the fault categories listed in an ATPG statistics report, AU is the most significant category that negatively affects test coverage and fault coverage. Determining the reasons why ATPG is unable to produce a pattern to detect these faults, and coming up with a strategy to improve the coverage, is the biggest challenge in debugging low-coverage problems.

Here are some of the most common reasons why faults may be ATPG_untestable:

Pin constraints: At least one input signal (usually more than one) is required to be constrained to a constant value to enable test mode. While this constraint makes testing possible, it also results in blocking the propagation of some faults because the logic is held in a constant state. Unless you have special knowledge to the contrary, these pin constraints must be adhered to, which means you cannot recover this coverage loss.

Determining the effect on coverage loss is not as simple as counting the number of constrained faults on the net. The effect on defect coverage also extends to all the logic gates that have an input tied and whatever upstream faults are blocked by that constraint. Faults downstream from the tied logic have limited control, which further affects coverage.

Black-box models: When an ATPG model is not available for a module, a library cell, or, more commonly, a memory, ATPG tools treat it as a “black box” that propagates a fixed value (often an “X,” or unknown, value). Faults in the “shadow” of these black boxes (i.e., faults whose control and observation are affected by their proximity to the black box) will not be detected. This includes faults in the logic cone driving each black-box input as well as the logic cones driven by the outputs. Obtaining an exact number of undetected faults is complicated by the fact that some of those faults may also be in other, overlapping cones that are detected. The solution is to ensure that everything is modeled in the design.
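One rough way to approximate the black-box shadow is to trace the cone of logic backward from each black-box input and forward from each black-box output, then separately discount faults that are still observable through other, unaffected cones. The sketch below shows only the cone-collection step on a toy gate-level graph; the connectivity, gate names, and black-box pin names are invented, and a production flow would work on the real netlist database.

```python
from collections import deque

# Toy gate-level connectivity; names are illustrative only.
fanin = {                      # gate -> gates driving its inputs
    "BB/D0": ["g3"], "g3": ["g1", "g2"], "g1": [], "g2": [],
}
fanout = {                     # gate -> gates it drives
    "g1": ["g3"], "g2": ["g3"], "g3": ["BB/D0"],
    "BB/Q0": ["g4"], "g4": ["g5"], "g5": [],
}

def cone(start, edges):
    """Collect every gate reachable from `start` through `edges` (simple BFS)."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Shadow candidates: logic driving each black-box input (backward cone)
# plus logic driven by each black-box output (forward cone).
shadow = cone("BB/D0", fanin) | cone("BB/Q0", fanout)
print(sorted(shadow))   # candidates only; faults also seen in overlapping detected cones must still be subtracted
```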

Random access memory: In the absence of either bypass logic or the ability to write/read through RAMs, faults in the shadow of the RAM may be undetected. Similar to black-box faults, it is difficult to determine exactly which faults are not detected because of potentially overlapping cones of logic.

If you make design changes, adding bypass logic may address this problem. Some ATPG tools are capable of generating special “RAM-sequential” patterns that can propagate faults through memories so long as the applicable design rule checks (DRCs) are satisfied. This may be an option to get around having to modify the design to improve coverage.

Cell constraints: Sometimes you need to constrain scan cells with regard to what values they are capable of loading and capturing (usually for timing-related reasons). These constraints imposed on the ATPG tool will prevent some faults from being detected. If the cell constraint is one that limits capturing, then to determine the effect, you’ll need to look at the cone of logic that drives the scan cell and sift out faults that are detected by overlapping cones.


If found early enough in the design cycle, the underlying timing issue can possibly be corrected, which makes cell constraints unnecessary. However, this type of timing problem is often found too late in the design cycle to be changed. Using cell constraints is a bandage approach to getting patterns to pass, and the resulting test-coverage loss is the price to be paid.

ATPG constraints: You may impose additional constraints on the ATPG tool to ensure that certain areas of the design are held in a desired state. For example, let’s say you need to hold an internal bus driving in one direction. As with all types of constraints, parts of the design will be prevented from toggling, which limits test coverage. Similar to pin constraints, if the assumption is that these are necessary for the test to work, the coverage loss cannot be addressed.

False/multicycle paths: Some limitations to test coverage are specific to at-speed testing. False paths cannot be tested at functional frequencies; therefore, ATPG must be prevented from doing so to avoid failures on the automatic test equipment. Because transition-delay fault (TDF) patterns use only one at-speed cycle to propagate faults, multicycle paths (which require more than one cycle) must also be masked out. Determining which faults are not detected in false paths is complicated by the manner in which false paths are defined.

Delay-constraint files usually specify a path by designating “-from,” “-to,” and possibly “-through” to describe the start and end points of the path. In between those points, there can be a significant amount of logic to trace, and potentially multiple paths if you don’t use “-through” to specify the exact path.
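As a rough illustration of what such path exceptions look like and how they might be pulled apart, the Python sketch below parses two simplified, SDC-style lines. The exact file format varies by tool and flow, so both the constraint lines and the parsing are illustrative assumptions, not any particular vendor’s syntax.

```python
import re

# Simplified, illustrative path-exception lines; real delay-constraint formats vary by tool.
constraint_text = """
set_false_path -from regs/src_reg -through mux/u_sel -to regs/dst_reg
set_multicycle_path 2 -from fifo/wr_ptr_reg -to fifo/rd_ptr_reg
"""

spec = re.compile(r"-(from|to|through)\s+(\S+)")

for line in constraint_text.strip().splitlines():
    points = {key: value for key, value in spec.findall(line)}
    kind = "false path" if "false_path" in line else "multicycle path"
    print(kind, points)
    # Without a -through point, every physical path between -from and -to matches,
    # so the set of masked (AU) faults can be much larger than a single path.
```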

STEPS TO IDENTIFY AND QUANTIFY COVERAGE ISSUES

There are three aspects of the debug challenge:

 How you identify which coverage issues (as described above) exist,

 How you determine the effect each issue has on the coverage, and

 What, if anything, you can do to improve the coverage.

Typically, we have had to rely on a significant amount of design experience as well as ATPG-tool proficiency to manually determine and quantify the effects of design characteristics or ATPG settings that limit coverage. The usual steps required to manually debug fault coverage are:

1. Identify a common thread in the AU faults.
2. Investigate a single representative fault.
3. Rely on your experience to recognize trends.
4. Determine the effect of the issue on test coverage.
Let’s look at these steps in turn. When it comes to identifying a common thread in the AU faults, it is extremely difficult, if not impossible, to identify a single problem by looking at a list of AU faults. You have to recognize trends in either the text listing of faults or a graphical view of faults relative to the design hierarchy. For example, a long list of faults that are obviously contained in the design hierarchy of the boundary-scan logic may be caused by a single problem.
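A simple way to spot such a trend is to bucket the AU fault list by hierarchical prefix and sort by count. The fault-list format and instance names in this Python sketch are invented for illustration; the AU list itself would come from your ATPG tool’s fault reporting.

```python
from collections import Counter

# Illustrative AU fault list: (stuck value, hierarchical instance pin path).
au_faults = [
    ("SA0", "top/bscan/u_tap/U12/A"),
    ("SA1", "top/bscan/u_tap/U12/Y"),
    ("SA0", "top/bscan/u_bsr/U3/A"),
    ("SA1", "top/core/u_alu/U77/B"),
]

def prefix(path, depth=2):
    """Keep the first `depth` levels of hierarchy as the grouping key."""
    return "/".join(path.split("/")[:depth])

by_block = Counter(prefix(path) for _, path in au_faults)
for block, count in by_block.most_common():
    print(f"{block:20s} {count} AU faults")
# A block such as top/bscan dominating the list hints at one underlying cause,
# e.g. boundary-scan logic held in a fixed state by the test-setup procedure.
```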

At some point, you’ll need to focus your analysis efforts on one fault at a time, so pick one you think might represent a larger group of faults. You might zero in on design elements like registers or memories, but this is usually based more on intuition than anything else. ATPG tools have different reporting capabilities that can be used to report on the inherent controllability and observability of a fault location, which can help but often provide limited information. Interpreting the reports at this level requires an in-depth knowledge of the ATPG tool’s capabilities and a fair amount of instinct regarding where to focus efforts.

As is often the case, your success with debugging relies on having been through the process and identified similar situations. For example, if a significant number of boundary-scan faults are listed as AU, this may be an indication that the boundary-scan logic has been initialized to a certain desired state and must be held in that state to operate properly. Making connections like this between the trends you identify in the list of AU faults and what you know about designs and design practices in general requires a fair amount of experience.

Once an issue is identified, how you determine its significance will differ depending on the issue. As previously described, you often need to keep track of backward and forward cones of logic fanning out from a single constrained point to determine the potential group of affected faults. From there, you also need to evaluate each of those potential faults to assess whether it is possibly observed in another, overlapping cone of logic.

Some other possible techniques can approximate the effect of some issues. For pin constraints, it may be possible to have the tool temporarily treat them like tied-untestable faults so that coverage can be recalculated and compared to the original coverage number. Whole design modules can be no-faulted (for example, memory built-in self-test [MBIST] logic) to see the difference in coverage.

All of these approaches require a combination of special scripts to trace logic paths backward and/or forward, multiple runs of the ATPG tool with different settings, and a high level of tool expertise. Even then, the actual effect is usually still based on an approximation.

AUTOMATED DEBUG ANALYSIS

Recently, ATPG tools have been improved to automatically identify issues that affect test coverage and quantify just how much each issue affects the coverage. The most common method to display this information is through a modified version of the traditional statistics report that you can access in the command-line mode of ATPG tools. Mentor Graphics’ ATPG tools FastScan and TestKompress are used as an example here to demonstrate what’s available for automated analysis of low test coverage.
Without any additional ATPG tool runs or any of the manual debug steps, the new statistics report automatically provides details about coverage issues (Fig. 2). Note the list of the total number of uncollapsed faults in the design, which is then broken down into various ATPG categories (Fig. 2, arrow #1). The percentage listed within the parentheses is based on the total number of faults.

The next important area of the report is the test coverage achieved by the patterns generated (Fig. 2, arrow #2). In this case, the coverage is 83.67%, which may not be acceptable. If that test coverage is unacceptable, the next place to look is the line in the statistics report that indicates the number of ATPG_untestable, or AU, faults (Fig. 2, arrow #3). This line points out that 57,563 faults (or 14.56% of the total number of faults) are AU.

Up to this point, the information is very typical of what you would find in a traditional report. Moving down to the “Untested Faults” section (Fig. 2, arrow #4), you can now get a detailed breakdown of which AU categories have a significant effect on test-coverage loss. The first and most significant category of test-coverage loss is TC, or tied cells (Fig. 2, arrow #5). This category of AU faults accounts for 4.46% of the total number of faults. In this case, “tied cells” refers to registers that are tied to a particular state as a result of the ATPG tool having performed DRCs and simulated an initialization or “test_setup” procedure.

The report also lists the most significant individual tied cells (as well as the state to which they are tied), so that you can evaluate the severity of the effect on test coverage at a fine level of detail. A quick review of the instance path names of these tied cells suggests that it’s all test-related logic (boundary scan and MBIST). Although you must still perform additional manual analysis to determine if this category of AU faults can be reduced, this report gives a clear indication of where to look in the design. If it is determined that nothing can be corrected because the test mode requires this logic to be tied, then at least you will be able to explain why 4.46% of the faults will remain untestable.

The next significant category of AU faults is FP, or “false_path,” faults (Fig. 2, arrow #6). This transition-fault pattern set includes a definition of false paths, so the coverage will be lower. From this report, you can see that 5.37% of the faults cannot be tested because of the false-path definitions. Many test engineers believe that test coverage should not be penalized as a result of false paths because they are functionally false paths that, by definition, cannot be tested at speed.

A relatively significant number of multicycle-path faults (1.01%) hurt the test coverage (Fig. 2, arrow #7). Given this information, you may choose to address these faults by targeting them with another pattern set using a clock cycle that will exercise them at a lower frequency. There is no guarantee that all of these faults will be detected at a different frequency because other issues may prevent detection. What the report tells you is that these definitely cannot be tested because of the reason listed. This is true for all the categories.
The SEQ (sequential_depth) category (Fig. 2, arrow #8) refers to faults that cannot be detected because the sequential depth of the ATPG tool has not been set high enough. This implies that there may be some non-scan logic or memories that require an increased sequential depth to propagate and detect faults. You can affect this number by changing some of the settings during pattern generation.

Right after the SEQ category is another category called “Unclassified.” This is a group of faults that does not fall into any of the pre-defined categories that the ATPG tool can determine. They are faults that traditional statistics reports would normally indicate as AU—there’s just no additional detailed analysis available to determine why they are AU. These faults will require manual analysis.

I previously mentioned that many test engineers do not believe false-path faults should be included in the calculation of test coverage, while others do. To satisfy these differing requirements, a new column of information called “total relevant” has been added to the statistics report (Fig. 2, arrow #9).

Faults that were not considered relevant were deleted, which resulted in the lower number of total faults (374,238) as compared to the total number of faults in the neighboring column (395,480). How can you tell which faults were deleted from the relevant coverage calculation? If you trace down the “Total Relevant” column of information, you will eventually see the word “deleted” corresponding to the false-path category. This means that the 21,242 false-path faults were deleted from the total relevant faults, and the coverage was recalculated. The relevant coverage was 88.46% as compared to 83.67% (Fig. 2, arrow #10). You can see both coverage numbers side by side and determine which one should be used.
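The recalculation itself is simple arithmetic: the deleted category is removed from both the fault total and the coverage denominator. The Python sketch below uses the fault totals quoted above, but the report excerpt does not list the untestable or detected counts, so the values assumed for those here are purely illustrative.

```python
# Fault counts quoted in the report above, plus two assumed values (the report
# excerpt does not list the untestable or detected totals).
total_faults      = 395_480
false_path_faults = 21_242       # the category marked "deleted" in the relevant column
untestable_faults = 3_000        # assumed for illustration
detected_faults   = 328_400      # assumed for illustration

testable          = total_faults - untestable_faults
relevant_total    = total_faults - false_path_faults      # 374,238
relevant_testable = testable - false_path_faults

test_coverage     = detected_faults / testable
relevant_coverage = detected_faults / relevant_testable   # denominator shrinks, coverage rises

print(f"total relevant faults : {relevant_total}")
print(f"test coverage         : {test_coverage:.2%}")
print(f"relevant test coverage: {relevant_coverage:.2%}")
```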

Another way to slice the coverage information is to view it with respect to the clock domains (Fig. 2, arrow #11). The next column to the right indicates what percentage of the total number of faults is covered by that clock domain (e.g., 58.71% of the faults in the design are in the clk1 clock domain).

The next column indicates the test coverage of that clock domain’s fault population. In this case, 94.88% of the clk1 faults were detected. The point of listing both the percentage of total faults and the percentage coverage of each clock domain is so that you can investigate low coverage for clock domains that represent a significant percentage of the design. Additional reporting capability is available so that a detailed analysis of the AU faults can be shown for the fault universe of each individual clock domain.
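Both per-domain percentages come from the same per-fault records: each fault’s share of the whole fault universe and its detection status within its own domain. The Python sketch below shows the arithmetic on an invented fault list tagged with clock domains; the domain names and counts are illustrative only.

```python
from collections import defaultdict

# Illustrative fault records: (clock domain, detected?). A real flow would take
# these from the tool's per-fault reporting.
faults = [("clk1", True)] * 5_000 + [("clk1", False)] * 300 \
       + [("clk2", True)] * 2_500 + [("clk2", False)] * 700

totals, detected = defaultdict(int), defaultdict(int)
for domain, hit in faults:
    totals[domain] += 1
    detected[domain] += hit

grand_total = len(faults)
for domain in sorted(totals):
    share    = totals[domain] / grand_total          # % of the whole fault universe
    coverage = detected[domain] / totals[domain]     # coverage inside the domain
    print(f"{domain}: {share:.2%} of all faults, {coverage:.2%} domain coverage")
```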
Some tools provide more graphical means of viewing this information relative to design hierarchies as well as the design’s clock domains. In addition to the traditional statistics report viewed on the tool’s command line, you can look at the coverage analysis graphically. An example shows how the AU analysis categories can be displayed relative to the design hierarchy (Fig. 3, top left panel). The bottom panel displays the same statistics report as shown on the command line, but design instances are hyperlinked so that you can bring up the schematic view of that instance (Fig. 3, top right panel). You can also overlay the fault category information on the schematic view. The example shown here is the same one discussed earlier, in which boundary-scan logic is tied because of the initialization procedure, which resulted in a loss of 0.24% test coverage.

The additional information provided in detailed statistics reports like this provides valuable insight into how to identify and address potential test-coverage issues. Debug automation in an ATPG tool means that the most significant test-coverage issues are quickly highlighted along with their effect on coverage. In many cases (such as pin constraints and tied cells), you will be able to immediately determine that nothing can be done to fix the issue, and you can easily determine what the test-coverage ceiling will be.

Further automation within the ATPG tool eliminates significant manual effort and debug time required to sift through an otherwise nonsensical listing of untestable faults. As a result, you are freed to focus on the task of resolving the identified problems.
