Q&A
What are the core files required to use HeteroSTA?
To compile and run an application using HeteroSTA, you need to include the C/C++ header files (heterosta.h and netlistdb.h) in your source code. During the linking phase, you must link against the library file, libheterosta.so.
What physical units does HeteroSTA use, and what are the key considerations for placement data?
HeteroSTA's internal STA engine operates with standard industry units: femtofarads (fF) for capacitance and kilo-ohms (kΩ) for resistance.
When using heterosta_extract_rc_from_placement, there are two critical considerations:
- Pin-level Coordinates: The function requires the exact coordinates of each pin, not the origin of its parent cell. Your application is responsible for calculating the absolute pin coordinates based on the cell's placement and the pin's relative position within the cell (often found in LEF files).
- Unit Consistency: The absolute units of your input coordinates are user-defined (e.g., microns), but the per-unit-distance RC parameters you provide (unit_cap and unit_res) must be scaled to match them so that the products come out in the required internal units. As stated in the API reference, the following relationships must hold:
unit_cap * distance = Capacitance in fF (1e-15 Farads)
unit_res * distance = Resistance in kΩ (1e3 Ohms)
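The two considerations above can be sketched in plain C. This is only an illustration: the Point type and both helper functions are hypothetical application-side code, not part of the HeteroSTA API, and the sketch assumes unrotated, unmirrored cells and RC values that you know per micron.

```c
#include <assert.h>

/* Hypothetical helper type for illustration; not part of the HeteroSTA API. */
typedef struct { double x, y; } Point;

/* Absolute pin position = cell origin + pin offset inside the cell.
   Assumes an unrotated, unmirrored cell; otherwise transform the offset first. */
static Point absolute_pin_position(Point cell_origin, Point pin_offset) {
    Point p = { cell_origin.x + pin_offset.x, cell_origin.y + pin_offset.y };
    return p;
}

/* Rescale a per-micron RC value to a per-coordinate-unit value.
   units_per_micron is how many coordinate units make up one micron. */
static double per_coordinate_unit(double value_per_micron, double units_per_micron) {
    assert(units_per_micron > 0.0);
    return value_per_micron / units_per_micron;
}
```

For example, if coordinates are in database units at 2000 units per micron and the wire capacitance is taken as 0.2 fF per micron (an illustrative figure, not a HeteroSTA default), per_coordinate_unit(0.2, 2000.0) gives the unit_cap to pass, and a 4000-unit wire then works out to about 0.4 fF.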
Does HeteroSTA preserve the pin and cell order from my input files?
It depends on how the netlist is loaded. This is a critical distinction:
- Using heterosta_read_netlist: The library parses the Verilog file and creates its own internal ordering. Do not assume this order matches your file. To find the internal ID for a given pin name, you must use the heterosta_lookup_pin function.
- Using heterosta_set_netlistdb: This low-level approach gives you full control. The internal pin and cell IDs used by HeteroSTA correspond directly to the indices you define when you construct and pass the NetlistDB object. This is the recommended method for tight integration with tools like placers, as it eliminates the need for ID remapping.
What are the key data formatting requirements when building a NetlistDB manually?
Based on the Netlist Loading Demonstration, two key requirements are:
- IO Pins (Ports): The design's top-level input/output ports are not part of any instance. They should be associated with a special cell representing the top-level module, which by convention is the cell at index 0.
- Instance Pin Naming: Pins belonging to a cell instance must use a hierarchical naming convention with a forward slash (/) as the separator, such as <instance_name>/<pin_name> (e.g., u1/a).
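As a minimal sketch of the naming convention, the helper below builds an <instance_name>/<pin_name> string. The function make_pin_name is a hypothetical application-side helper, not a HeteroSTA call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<instance>/<pin>" into out.
   Returns 0 on success, -1 if the result would not fit in out_size bytes. */
static int make_pin_name(char *out, size_t out_size,
                         const char *instance, const char *pin) {
    assert(out != NULL && instance != NULL && pin != NULL);
    int n = snprintf(out, out_size, "%s/%s", instance, pin);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Checking the return value matters because snprintf truncates silently, and a truncated pin name would no longer match the netlist.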
What is the typical API calling sequence?
The API is designed to be called in a logical sequence. A typical workflow, detailed in the Get Started guide, can be broken down into three phases:
- Initialization and Setup (Called once):
  - heterosta_init_license(): Initialize and validate the HeteroSTA license.
  - heterosta_init_logger(): Initialize the logger with the callback function.
  - heterosta_new(): Create the STAHoldings environment.
  - heterosta_set_delay_calculator_*(): Choose a delay model (e.g., Arnoldi).
  - heterosta_read_liberty(): Load timing libraries for both EARLY and LATE corners.
  - heterosta_read_netlist() or heterosta_set_netlistdb(): Load the design netlist.
  - heterosta_flatten_all(): Finalize the design data into a high-performance format.
  - heterosta_build_graph(): Construct the internal timing graph.
  - heterosta_read_sdc(): Load the design constraints.
- Timing Analysis (Called in a loop for optimization):
  - heterosta_extract_rc_from_placement(): Update parasitics based on new pin locations. (Alternatively, heterosta_read_spef is used for one-shot analysis.)
  - heterosta_update_delay(): Recalculate cell and net delays.
  - heterosta_update_arrivals(): Propagate arrival times and calculate slacks. This must be called after heterosta_update_delay.
- Reporting and Cleanup (Called after analysis):
  - heterosta_report_wns_tns() or heterosta_report_slacks_at_max(): Retrieve timing results.
  - heterosta_free(): Release all memory associated with the STAHoldings environment.
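One way to keep the analysis loop in order is a small guard in your own application that records which step ran last. The sketch below is hypothetical wrapper code, not library behavior: the FlowGuard type, its enums, and flow_guard_advance are invented for illustration, and no HeteroSTA function is called. It assumes that delays need fresh parasitics, arrivals need fresh delays, and reports need fresh arrivals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical application-side bookkeeping; not part of the HeteroSTA API. */
typedef enum {
    STAGE_SETUP_DONE,      /* init, libraries, netlist, graph, SDC are loaded */
    STAGE_PARASITICS_READY,
    STAGE_DELAY_READY,
    STAGE_ARRIVALS_READY
} FlowStage;

typedef enum {
    STEP_LOAD_PARASITICS,  /* extract RC from placement, or read SPEF */
    STEP_UPDATE_DELAY,
    STEP_UPDATE_ARRIVALS,
    STEP_REPORT
} FlowStep;

typedef struct { FlowStage stage; } FlowGuard;

/* Returns 1 if the step is allowed now (and records it), 0 otherwise. */
static int flow_guard_advance(FlowGuard *g, FlowStep step) {
    assert(g != NULL);
    switch (step) {
    case STEP_LOAD_PARASITICS:
        /* New parasitics invalidate any earlier delays and arrivals. */
        g->stage = STAGE_PARASITICS_READY;
        return 1;
    case STEP_UPDATE_DELAY:
        if (g->stage < STAGE_PARASITICS_READY) return 0;
        g->stage = STAGE_DELAY_READY;
        return 1;
    case STEP_UPDATE_ARRIVALS:
        if (g->stage < STAGE_DELAY_READY) return 0;
        g->stage = STAGE_ARRIVALS_READY;
        return 1;
    case STEP_REPORT:
        return g->stage == STAGE_ARRIVALS_READY;
    }
    return 0;
}
```

In a placement loop you would call flow_guard_advance before each corresponding library call and skip or abort when it returns 0, so a report issued after new parasitics but before re-propagation is caught in your own code.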
For timing-driven optimization, should I use setup slack or hold slack?
You should primarily use setup slack (heterosta_report_slacks_at_max). Setup time violations determine the maximum clock frequency of the design, making setup slack the most critical metric for performance optimization during physical design stages like placement.
When should I use heterosta_extract_rc_from_placement versus heterosta_read_spef?
- Use heterosta_extract_rc_from_placement during iterative, timing-driven placement. In this flow, cell positions change in every iteration. This API lets you re-estimate parasitics from the latest layout to guide the placer.
- Use heterosta_read_spef for sign-off or post-layout analysis. When you have a static layout and a detailed parasitic file generated by an extraction tool, this API provides the most accurate results.
How can I debug a CUDA IllegalAddress error?
This error almost always indicates a memory location mismatch. When you call an API with use_cuda=true, the library expects all array pointers (e.g., xs and ys coordinates) to point to memory previously allocated on the GPU device. If you pass a pointer to standard CPU host memory, the GPU kernel cannot access that address, resulting in an IllegalAddress crash. Ensure all necessary data has been correctly transferred to the GPU before the API call.
Can I mix GPU-accelerated API calls with CPU calls for reporting?
Yes, this is a supported and recommended workflow. You can perform the computationally intensive tasks (like heterosta_update_delay and heterosta_update_arrivals) on the GPU by setting use_cuda=true. Then, for reporting functions like heterosta_report_wns_tns, you can set use_cuda=false. The library will handle the internal data transfer from GPU to CPU to generate the results.
What is the purpose of nets_zero_array and nets_one_array when building a NetlistDB?
These arrays are important for an accurate analysis. They explicitly tell the STA engine which nets are tied to a constant logic '0' (ground) and logic '1' (power). This information is crucial for correct logic propagation, identifying constant pins, and preventing the analysis of false timing paths. The Netlist Loading Demonstration provides a clear example of how these are populated.
Does HeteroSTA have requirements for instance pin naming?
Yes. HeteroSTA expects a hierarchical naming scheme using a forward slash (/).
- U1/a is supported.
- U1:a is not supported.
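If an upstream tool emits names such as U1:a, your application can rewrite them before handing them to HeteroSTA. The sketch below is a hypothetical helper, not a library function, and it assumes the last colon in the name is the instance/pin separator and that names already containing a slash should be left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if name contains no '/', replace its last ':' with '/'.
   Modifies name in place. Returns 1 if the name was changed, 0 otherwise. */
static int normalize_pin_separator(char *name) {
    assert(name != NULL);
    if (strchr(name, '/') != NULL) {
        return 0;               /* already hierarchical */
    }
    char *sep = strrchr(name, ':');
    if (sep == NULL) {
        return 0;               /* no separator to convert, e.g. a top-level port */
    }
    *sep = '/';
    return 1;
}
```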
Should virtual placement blockages be passed to HeteroSTA as cells?
No. The timing engine is concerned only with real, physical circuit elements that are part of a timing path (standard cells, macros, IOs). Virtual elements like placement blockages or virtual IO pins do not have timing characteristics and should be filtered out by your application before you build the NetlistDB.
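A minimal sketch of that filtering step, run on the placer-side cell list before any NetlistDB is built. The CellKind and PlacerCell types and the filter_timing_cells function are hypothetical; your placer's own data model will differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placer-side types for illustration only. */
typedef enum { CELL_STD, CELL_MACRO, CELL_IO, CELL_BLOCKAGE, CELL_VIRTUAL_IO } CellKind;
typedef struct { const char *name; CellKind kind; } PlacerCell;

/* Only real circuit elements take part in timing paths. */
static int has_timing(CellKind kind) {
    return kind == CELL_STD || kind == CELL_MACRO || kind == CELL_IO;
}

/* Compacts cells in place so that only timing-relevant cells remain.
   remap[i] receives the new index of original cell i, or -1 if it was dropped.
   Returns the number of cells kept. */
static size_t filter_timing_cells(PlacerCell *cells, size_t n, long *remap) {
    assert(cells != NULL && remap != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; ++i) {
        if (has_timing(cells[i].kind)) {
            remap[i] = (long)kept;
            cells[kept++] = cells[i];
        } else {
            remap[i] = -1;
        }
    }
    return kept;
}
```

The remap array lets the placer translate its own cell indices to the indices of the filtered list, which is useful when coordinates are later copied across for parasitic estimation.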
My timing report shows WNS and TNS as 0.0. What's the probable cause?
A WNS/TNS of 0.0, especially on a complex design, strongly suggests that the timing engine did not find any valid, constrained timing paths to analyze. Common causes include:
- Incorrect NetlistDB construction: The cell connectivity, pin directions, or clock port identification might be wrong.
- Missing or incorrect SDC constraints: A clock may not have been defined with create_clock, or I/O delays might be missing, leaving paths unconstrained.
- API sequence error: Calling a reporting function like heterosta_report_wns_tns before successfully running heterosta_update_delay and heterosta_update_arrivals will result in a panic or incorrect zeroed values.
- Incompatible Environment (GPU): When using use_cuda=true, an incompatible CUDA driver or toolkit version can sometimes cause silent failures in the GPU kernels, leading to zeroed results being returned.