Q41. Why do we emphasize setup violations before CTS and hold violations after CTS?
The setup check of a valid timing path compares the maximum data network delay against the clock edge arrival time at the sink. Until the post-CTS stage, we assume all clocks are ideal networks, i.e. the clock reaches every possible clock sink of the chip in zero time! What we need to focus on is implementing the data path in such a way that it takes no more than one clock period from start point to end point (assuming a full-cycle valid timing path). Of the two components of the setup check, one is always a constant (the clock period) and the other, the data path delay, is the variable we have every option to play with until the CTS stage completes. If we can't meet this stretch goal before CTS, we are going to have a hard time closing timing later. Hence, until the CTS stage, we focus on data path synthesis and data network physical implementation alone.
I hope it is clear why we focus on setup timing before CTS stage.
Now let's look at it from the other side: why don't we also focus on hold time at this stage?
The hold check of a path compares the minimum data path delay against the clock edge arrival time. Since the clock reaches every sink of the chip in zero time, even the minimum data path delay will almost always be greater than the hold requirement of a flop (the timing path endpoint). So unless there is going to be a change in the clock network delay, there is no point in analyzing the hold timing of a valid path, right? (At a bare minimum, though, one can review the grossly violating hold paths just to check whether they are false paths or multicycle paths.) A small numeric sketch follows.
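Here is a minimal sketch with toy values (not from any real design) of the two checks. With an ideal clock the latencies are zero, so setup slack depends only on the data path we are still free to optimize, while hold is trivially met; once a real clock tree adds latency and skew, hold violations can appear:

```python
# Toy setup/hold slack calculation, pre-CTS (ideal clock) vs post-CTS
# (propagated clock). All numbers in ns and purely illustrative.

T_CLK = 1.0          # clock period
T_CQ = 0.10          # launch flop clock-to-q delay
T_SETUP = 0.05       # capture flop setup requirement
T_HOLD = 0.03        # capture flop hold requirement

def setup_slack(data_delay, launch_lat=0.0, capture_lat=0.0):
    """Required time (next capture edge minus setup) minus data arrival."""
    arrival = launch_lat + T_CQ + data_delay
    required = capture_lat + T_CLK - T_SETUP
    return required - arrival

def hold_slack(min_data_delay, launch_lat=0.0, capture_lat=0.0):
    """Data arrival (same edge) minus the capture-edge hold window."""
    arrival = launch_lat + T_CQ + min_data_delay
    required = capture_lat + T_HOLD
    return arrival - required

# Pre-CTS: clocks are ideal (latencies = 0), so setup depends only on
# the data path delay we can still optimize ...
print(setup_slack(data_delay=0.95))                        # negative -> fix now
# ... while hold is trivially met (T_CQ + any data delay > T_HOLD).
print(hold_slack(min_data_delay=0.02))                     # positive

# Post-CTS: real clock latencies (skew) appear and can create hold
# violations that simply did not exist with an ideal clock.
print(hold_slack(min_data_delay=0.02, capture_lat=0.25))   # negative
```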
Q42. What should we do if there is a setup violation after placement even though optimization has completed?
Setup violation after placement is nothing to worry about. Well, unless it comes from a really bad placement of modules. Have a look at the macro placement and the module placement and see if something looks bad. For example, if there’s a module for instruction fetch and it's getting split and placed in two or three different clusters then we may want to attack this with module placement guides or bounds.
Make sure the tool has the right constraints at the place stage, and maybe give it another round with the timing effort flag set to high.
Further rounds of optimization will happen at the CTS and routing stages. Each of these will revisit the issue and make some improvement. I have seen bad slacks like -500 ps with 30000-plus failing paths, but these were dealt with aggressively by the timing team in STA (using things like upsizing cells, fixing max cap, max fanout and max transition violations, and putting in LVT cells).
Additional note: the routing and timing engines used at the place stage are not signoff quality, and do not come anywhere close to what a signoff tool like Tempus or PrimeTime can evaluate.
Q43. What is meant by insertion delay in VLSI physical design?
- The insertion delay concept comes into the picture during clock tree synthesis (CTS).
- While building the clock tree, CTS starts from the clock source and works toward the sinks.
- Once the clock tree is built, the clock signal has to travel from the source to the sinks. The time taken by the clock signal to travel from the source to the sinks is called the insertion delay.
Example:
The clock source is at point A, and the clock has to reach the sinks (flops) at points B, C and D. So the clock signal has to travel from point A to points B, C and D. In between, CTS builds some logic (buffers) to balance the three sinks, because the signal has to reach B, C and D at the same time; this is called skew balancing (the main aim of CTS).
The time taken by the clock signal from point A to points B, C and D is the insertion delay.
You can refer to LATENCY concepts for more in-depth information; a small numeric sketch follows.
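Here is a tiny sketch with made-up buffer and net delays for the A to B, C, D example above; it just sums the per-branch delays to get each sink's latency, then reports insertion delay and skew:

```python
# Toy illustration of insertion delay and skew for the A -> B, C, D example.
# Each sink's latency is the sum of buffer and net delays (ps) along its
# branch of the clock tree; all numbers are made up for illustration.

sink_latency_ps = {
    "B": 120 + 35 + 40,   # buffers + nets from source A to flop B
    "C": 120 + 30 + 55,
    "D": 120 + 45 + 30,
}

insertion_delay = max(sink_latency_ps.values())   # longest source-to-sink delay
skew = max(sink_latency_ps.values()) - min(sink_latency_ps.values())

print(f"insertion delay = {insertion_delay} ps, skew = {skew} ps")
```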
Q44. Why don't we do routing before CTS in VLSI Physical Design?
- Routing should be done once your design is at a stage where all of your data and clock nets are balanced and synthesized properly. Laying down the actual metal routes requires all of the design objects (cells) to be placed at legal sites, which we reach after the placement stage. But that doesn't mean your design is ready for routing: you still have to deal with high-fanout nets and the clock network after placement. Until this stage, clocks are ideal networks (assumed able to drive any number of loads without any buffering).
- During logic synthesis we do not balance high-fanout nets and clock nets, so a single clock port might be driving thousands of flops (with a virtual route, even after placement). CTS is the stage where this kind of loading is synthesized into a balanced tree so as to arrive at minimum skew and latency for all sinks (flops).
- Until you finish the logical synthesis of the clocks, you should not route anything. As soon as you finish CTS, you can start routing the clocks first, followed by the data signals.
Q45. What is a path group in VLSI, and why is it done?
- As the name indicates, it is a group of paths.
- The reason why paths are grouped is to guide the efforts of the synthesis engine.
- For example, let us assume that you start with all paths in a single path group.
- In this case the synthesis engine will spend most of its time optimizing the logic of the worst-case violator, and only once it meets timing will it move on to the next-worst violator, and so on.
- Now, looking at the initial timing report, you might have identified:
- Some paths that need an architectural change (e.g. a cascade of adders/multipliers to be replaced by pipelined logic); you do not want the synthesis engine to spend much time optimizing this logic, so make it a separate path group with lower priority.
- Low-violation paths that did not get optimized because all the effort was spent on the high-violation paths. Make separate path groups for these two sets (a small sketch of the idea follows).
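Below is a toy model of the idea, with hypothetical path names, slacks and weights; real tools expose this through commands such as group_path rather than a Python API. It shows how a single group lets the worst, architecture-limited path soak up all the effort, while separate weighted groups redirect effort to the paths that optimization can actually fix:

```python
# Toy model of path grouping. Path names, slacks and weights are hypothetical.

paths = [
    {"name": "mult_cascade", "slack": -1.50},  # needs an RTL/architecture fix
    {"name": "alu_bypass",   "slack": -0.20},
    {"name": "decode_stage", "slack": -0.15},
    {"name": "regfile_read", "slack": -0.05},
]

# Single group: a greedy optimizer keeps hammering the worst violator,
# so the -0.20/-0.15 paths never get attention.
worst_first = sorted(paths, key=lambda p: p["slack"])
print("single group, effort order:", [p["name"] for p in worst_first])

# Separate groups with weights: the architectural path is isolated with a
# low weight, so the remaining effort goes to paths the tool can fix.
groups = {
    "needs_rearchitecting": {"weight": 0.1, "paths": ["mult_cascade"]},
    "fixable_internal":     {"weight": 1.0, "paths": ["alu_bypass", "decode_stage", "regfile_read"]},
}
for name, g in sorted(groups.items(), key=lambda kv: -kv[1]["weight"]):
    print(f"group {name} (weight {g['weight']}): work on {g['paths']}")
```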
Q46. What is the benefit of having separate path groups for I/O logic paths in VLSI?
- Path groups form the foundation of the optimization cost function in synthesis and PnR tools. The more realistic the path groups, the easier it is for the tool to reach an optimum in all respects.
- Most of the time our I/O constraints are budgeted values rather than actual ones. They might also not be clean from a clock-domain perspective. So they can hurt QoR if they are kept in the same group as internal paths. Also, the tool works on the most critical path of a group and tries to optimize all paths within a certain range of it, called the critical range. If an I/O path comes up as the most critical path, the tool might not work on the internal paths, resulting in a sub-optimal design.
Q47. While fixing timing, how do I find a false path in VLSI design?
False path is a very common term used in STA. It refers to a timing path that does not need to be optimized for timing, because in the normal working operation of the chip its signal is never required to be captured within a limited time when it is excited. In the normal scenario, a signal launched from a flip-flop has to be captured at another flip-flop within one clock cycle. However, there are certain scenarios where it does not matter at what time the signal originating from the transmitting flop arrives at the receiving flop. A timing path arising in such a scenario is labeled a false path and is not optimized for timing by the optimization tool. A toy illustration follows.
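As a toy illustration (hypothetical endpoints and slacks; in a real flow false paths are declared with the SDC set_false_path constraint and honored by the STA tool), marking a path false simply removes it from the set of violations the tool keeps working on:

```python
# Toy sketch: excluding declared false paths from worst-slack reporting.
# Endpoints and slack values are hypothetical.

paths = [
    {"from": "cfg_reg/q", "to": "core_flop/d", "slack": -0.80, "false": True},   # static config, never timed
    {"from": "async_a/q", "to": "sync_ff1/d",  "slack": -0.45, "false": True},   # async crossing, handled by a synchronizer
    {"from": "alu_ff/q",  "to": "wb_ff/d",     "slack": -0.12, "false": False},  # real functional path
]

real_violations = [p for p in paths if not p["false"] and p["slack"] < 0]
wns = min((p["slack"] for p in real_violations), default=0.0)

print("paths to actually optimize:", [(p["from"], p["to"]) for p in real_violations])
print("WNS after excluding false paths:", wns)
```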
Q48. What makes meeting timing on clock gating paths very challenging? What makes it more critical than a regular setup/hold flop to flop timing path?
- While building the clock tree, we try to balance all the flops. This places the clock gate (CG) that drives a bunch of flops earlier in the clock tree, roughly by the delay of the CG itself. As a result, the time available to meet setup at the clock-gating latch is the clock period minus that delay, which makes the check tighter to meet.
- Now, if the fanout of the CG is more than its drive capability, a small subtree (or maybe two parallel buffers) will be added below it, making the clock arrive at the CG even earlier and making setup even harder to meet (see the worked numbers below).
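A quick worked example with made-up numbers (in ns) shows the effect: the clock gate's clock arrives earlier than the launching flop's clock by the depth of the subtree below it, so the enable path loses exactly that much margin compared to a regular flop-to-flop path:

```python
# Toy numbers (ns) showing why the setup check at a clock-gating cell's
# enable pin is tighter than a regular flop-to-flop check.

T_CLK        = 1.00   # clock period
LAUNCH_LAT   = 1.20   # clock latency at the launching flop (full tree depth)
CG_LAT       = 0.80   # clock latency at the clock gate: it sits upstream of
                      # the subtree it drives, so the clock reaches it earlier
T_SETUP_CG   = 0.05   # setup requirement at the enable pin
T_CQ         = 0.10   # launch flop clock-to-q delay
ENABLE_DELAY = 0.50   # combinational delay of the enable logic

arrival  = LAUNCH_LAT + T_CQ + ENABLE_DELAY
required = CG_LAT + T_CLK - T_SETUP_CG
print("slack at CG enable:", round(required - arrival, 3))      # -0.05, violating

# For a regular flop-to-flop path with balanced latencies (both 1.20 ns),
# the same data delay would have 0.40 ns more margin.
required_ff = LAUNCH_LAT + T_CLK - T_SETUP_CG
print("slack flop-to-flop:", round(required_ff - arrival, 3))   # +0.35
```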
Q49. What is the difference between a static IR drop and a dynamic IR drop analysis?
Static IR drop is the voltage drop when a constant (average) current is drawn through the resistive power network; it is analyzed with the circuit in a steady state. Dynamic IR drop is the drop caused by high instantaneous current drawn from the power network due to heavy simultaneous switching of cells. To reduce static IR drop, increase the width of the power network or design a more robust power grid, whereas to reduce dynamic IR drop, reduce the toggle rate or place decap cells near high-switching cells.
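A toy Ohm's-law view (all values made up) of the two components and the knobs that help each:

```python
# Toy Ohm's-law view of static vs. dynamic IR drop (illustrative values only).

VDD = 0.90          # nominal supply (V)
R_GRID = 0.5        # effective grid resistance from pad to a cell's VDD pin (ohm)

# Static: average current drawn by the cells behind this branch of the grid.
i_avg = 0.020       # 20 mA average draw
static_drop = i_avg * R_GRID
print("static IR drop:", static_drop, "V ->", VDD - static_drop, "V at the cell")

# Dynamic: a burst of simultaneous switching draws a much larger peak current
# for a short time; decaps placed nearby supply part of this charge locally.
i_peak = 0.200      # 200 mA instantaneous peak
dynamic_drop = i_peak * R_GRID
print("dynamic IR drop:", dynamic_drop, "V (before decap/package effects)")

# Widening straps lowers R_GRID (helps both components); reducing toggle rate
# or adding decaps mainly reduces the dynamic component.
```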
Q50. What is the need of Static IR drop analysis?
IR drop is the voltage drop across the metal wires of the power grid before the supply reaches the VDD pins of the standard cells. Because of IR drop, the effective VDD seen by a cell is lower than nominal, which increases cell delay and can therefore cause timing issues.
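As a rough sketch of the timing impact, gate delay grows as the effective supply falls; the alpha-power-law approximation delay ∝ VDD / (VDD - Vth)^α, with made-up device values, gives a feel for the sensitivity:

```python
# Rough sketch of why IR drop causes timing issues: gate delay grows as the
# effective supply falls. Uses the alpha-power-law approximation
# delay ~ VDD / (VDD - Vth)**alpha with made-up device values.

VTH, ALPHA = 0.35, 1.3

def relative_delay(vdd, vdd_nom=0.90):
    """Delay at a drooped supply relative to the nominal supply."""
    d = lambda v: v / (v - VTH) ** ALPHA
    return d(vdd) / d(vdd_nom)

for drop_mv in (0, 30, 60, 90):
    vdd = 0.90 - drop_mv / 1000.0
    print(f"{drop_mv:3d} mV IR drop -> delay x{relative_delay(vdd):.3f}")
```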