Friday, 12 April 2013

Solving Setup time violations



The datapath delay has to be reduced to meet the setup timing requirement. If this is not possible, the other option the designer can try is to skew the clock so that the capture clock edge arrives later, giving the data more time to reach the capture flop.
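As a rough illustration of the two levers above, the setup check on a single-cycle reg2reg path can be modelled as required time minus arrival time. The sketch below is a simplified model (no derates, no clock uncertainty), not what any STA tool actually runs; the function name and all numbers are hypothetical and in picoseconds.

```python
def setup_slack(period, launch_latency, capture_latency,
                clk_to_q, data_delay, t_setup):
    """Setup slack (ps) for a single-cycle reg2reg path, simplified model."""
    # Time at which data reaches the D pin of the capture flop
    arrival = launch_latency + clk_to_q + data_delay
    # Latest time the data may arrive: next capture edge minus setup time
    required = period + capture_latency - t_setup
    return required - arrival


# Hypothetical failing path (all values in ps)
base = setup_slack(period=1000, launch_latency=200, capture_latency=200,
                   clk_to_q=100, data_delay=900, t_setup=50)

# Lever 1: reduce the datapath delay by 100 ps
faster_data = setup_slack(period=1000, launch_latency=200, capture_latency=200,
                          clk_to_q=100, data_delay=800, t_setup=50)

# Lever 2: delay the capture clock by 100 ps (skew the clock)
skewed = setup_slack(period=1000, launch_latency=200, capture_latency=300,
                     clk_to_q=100, data_delay=900, t_setup=50)

print(base, faster_data, skewed)
```

In this model both levers recover the same amount of slack, which is why clock skewing is a fallback when the datapath cannot be made any faster.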
An elaborate listing of the possible fixes is:
1)      Better placement: The timing path should be as linear as possible. There shouldn’t be any zig zag placement of the cells. This kind of haphazard placement means that the routes will be long leading to more delays.
2)      Better transitions/less capacitance: The timing path must be well structured. This means that there shouldn’t be cells driving loads placed at long distances, since this leads to bad transition (tran) and increased capacitance (cap) on the nets, and hence more delay. At appropriate stages, buffers/repeaters have to be added to ensure better tran values. Note that in the latest process technologies (28nm and below), the cell delay is almost comparable to the net delay. Hence, adding an extra stage of buffers/repeaters can reduce the effect of bad tran on a net by roughly half.
3)      Better optimizations: The tools are intelligent enough to optimize the timing path well. But in some cases the tool may not honour the timing weights, and so some paths may not be optimized well. The cells may be of either very high or very poor drive strength. If a cell is of high drive strength and the preceding stage is not able to drive its input load, this leads to a cap violation and the delay worsens. On the other hand, if the cell is of low drive strength, then the tran on the driven net worsens and the delay increases. Hence the choice of cells should ideally depend on timing path criticality and placement of cells.
4)      Logic re-structuring: Tools may perform a good physical synthesis, but at times the designer has to restructure the logic manually. For example, an AND gate may have a delay of 20ps while the combination of NAND and NOT gates may give a delay of just 10ps. Some of the redundant logic added or inferred by the tool can also be eliminated. Unnecessary buffers and complex gates (like AND-OR combinations for a simple mux) can be removed.
5)      Logic replication: Gates/flops can be duplicated to cater to a large fanout. For example, if there is a flopped version of a reset signal going to 100 AND gates, then the flop can be replicated (assuming that the D side has enough slack). Each copy then drives only a fraction of the original 100 loads, so the cap and tran seen by each driver improve and the slack margins improve significantly.
6)      Proper constraints: Constraints have a direct impact on the placement of cells in a timing path, the choice of cells by the tool during physical synthesis and the timing path group optimization. Interface paths need proper constraints to ensure the IO paths are neither over-optimized nor under-optimized.
Path groups: The critical range and the weight set on each of the major timing path groups (like reg2reg, io_to_io, input_to_reg, reg_to_output) should be chosen carefully. If internal timing closure is the priority, then a high weight can be set on the reg2reg group.
7)      Cell swaps: If there is difficulty in achieving any better optimization, then the cells can be swapped to LVT (low threshold voltage) variants. With a lower threshold, the channel in the transistors forms sooner and hence the gates switch faster, improving the delays.
8)      Addition of latches: Adding a negative level latch before the launch flop in a failing reg2reg path (both flops posedge), or a positive level latch (when both flops are negedge), allows half a cycle to be borrowed to meet setup.
Another practice followed at some places is to swap the flop failing setup with a positive/negative level sensitive latch to borrow time. But care should be taken that data is not missed and that X is not propagated.
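To make the buffer/repeater point in item 2 concrete, here is a minimal sketch using a first-order Elmore-style RC model of a long net split into equal segments, each driven by one repeater. Everything in it is an assumption for illustration: the function name, the per-micron resistance and capacitance, the driver resistance, the pin capacitance and the intrinsic buffer delay are hypothetical values, not data from any real library or process.

```python
def repeated_wire_delay_ps(length_um, n_stages,
                           r_per_um=2.0,    # wire resistance, ohm/um (hypothetical)
                           c_per_um=0.2,    # wire capacitance, fF/um (hypothetical)
                           r_drv=1000.0,    # driver output resistance, ohm (hypothetical)
                           c_in=2.0,        # input pin cap of next stage, fF (hypothetical)
                           t_buf_ps=15.0):  # intrinsic delay per driver, ps (hypothetical)
    """First-order delay of a net driven through n_stages equal segments.

    n_stages = 1 means a single driver with no extra repeaters.
    """
    seg_len = length_um / n_stages
    r_wire = r_per_um * seg_len
    c_wire = c_per_um * seg_len
    # Driver charges the whole segment plus the next pin; the distributed
    # wire resistance sees half of its own cap plus the next pin.
    stage_fs = r_drv * (c_wire + c_in) + r_wire * (c_wire / 2 + c_in)
    # ohm * fF = fs, so divide by 1000 to get ps
    return n_stages * (t_buf_ps + stage_fs / 1000.0)


LENGTH_UM = 2000  # hypothetical long route
delays = {n: repeated_wire_delay_ps(LENGTH_UM, n) for n in range(1, 17)}
best_n = min(delays, key=delays.get)

for n in (1, 2, 4, best_n, 16):
    print(f"{n:2d} stage(s): {delays[n]:7.1f} ps")
```

With these made-up numbers the wire term falls quickly as stages are added, because wire delay grows with the square of segment length, but past a certain count the intrinsic delay of the extra repeaters dominates and the total rises again. That is why repeaters belong at appropriate stages rather than everywhere.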

Clock skewing:
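Positive skew (the capture clock arriving later than the launch clock) relaxes setup. So if there is enough slack to spare, the capture clock path can be delayed to relax setup on the failing path. The added delay comes out of the hold margin of the same path and the setup margin of the paths launched from the capture flop, so both must be checked. This is normally done after the CTS is optimized and one round of hold fixes is done.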
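The bookkeeping behind this kind of skewing can be sketched as follows. It is a simplified model, assuming a single failing path and a single path leaving the capture flop; the function names and the numbers (in ps) are hypothetical and do not correspond to any tool command.

```python
def max_useful_skew(setup_slack, hold_slack, next_stage_setup_slack):
    """Delay (ps) to add on the capture clock without creating a new violation."""
    needed = max(0, -setup_slack)
    # Delaying the capture clock eats hold slack on this path and
    # setup slack on the path launched from the capture flop.
    available = max(0, min(hold_slack, next_stage_setup_slack))
    return min(needed, available)


def slacks_after_skew(setup_slack, hold_slack, next_stage_setup_slack, skew):
    """Slacks (ps) after delaying the capture clock by `skew`."""
    return {
        "setup": setup_slack + skew,
        "hold": hold_slack - skew,
        "next_setup": next_stage_setup_slack - skew,
    }


# Case 1: enough margin elsewhere, the violation is fully fixed
skew1 = max_useful_skew(setup_slack=-40, hold_slack=60, next_stage_setup_slack=100)
after1 = slacks_after_skew(-40, 60, 100, skew1)

# Case 2: hold margin limits the skew, part of the violation remains
skew2 = max_useful_skew(setup_slack=-80, hold_slack=30, next_stage_setup_slack=100)
after2 = slacks_after_skew(-80, 30, 100, skew2)

print(skew1, after1)
print(skew2, after2)
```

In the second case the remaining violation has to be closed with one of the datapath fixes listed earlier.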

AOCV derates:
The derates applied should be specific to the PVT corner and should be chosen with care.
