Get Started with HeteroSTA

Introduction

HeteroSTA is a high-performance, GPU-accelerated static timing analysis (STA) engine designed for both standard verification and integration into custom EDA tools. This guide will walk you through the core concepts of a standard STA workflow, explaining the different API choices available at each step.

We will cover two primary usage models:

  1. Standard Sign-off STA: A complete workflow for analyzing a design from Verilog and SPEF files.
  2. Integration with an Optimization Engine: A workflow for using HeteroSTA to guide a tool like a placer by providing raw timing data.

Prerequisite: License Initialization

Before you begin any STA workflow, you must initialize and validate the HeteroSTA license. This must be the very first API call. If the license is not successfully initialized, the subsequent call to heterosta_new() will fail by returning NULL, preventing any further interaction with the library.

You can obtain a license key by following the instructions on our getting started page.

  • API: heterosta_init_license()

Best Practice: Managing Your License Key

For better security and flexibility, we recommend storing your license key in an environment variable rather than hardcoding it. This makes your program easier to deploy and migrate across different environments.

1. Setting the Environment Variable

  • Temporarily (for the current shell session):

    export HeteroSTA_Lic="lic:heterosta:your-license-key-string-here"
  • Permanently (for all future sessions):
    Add the above line to your shell's configuration file (e.g., ~/.bashrc, ~/.zshrc).

2. Acquiring the License Key in Your Code

Here is an example of how to read the license from the environment variable, with a fallback to a hardcoded value for development or testing:

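#include <cstdlib>   // std::getenv
#include <iostream>  // std::cout, std::cerr
#include "heterosta.h"
 
// Optional hardcoded fallback license (use with caution).
// Set this to nullptr to disable the fallback entirely.
const char* hardcode_lic = "lic:heterosta:your-fallback-license-key";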
 
// 1. Acquire license, prioritizing the environment variable.
const char* lic = std::getenv("HeteroSTA_Lic");
if (lic == nullptr) {
    if (hardcode_lic == nullptr) {
        std::cerr << "[FATAL ERROR] License not found in environment variable 'HeteroSTA_Lic' and no fallback is provided." << std::endl;
        return 1;
    }
    lic = hardcode_lic;
    std::cout << "[INFO] 'HeteroSTA_Lic' environment variable not found. Using hardcoded license." << std::endl;
} else {
    std::cout << "[INFO] Successfully loaded license from 'HeteroSTA_Lic' environment variable." << std::endl;
}
 
// 2. Initialize the license.
bool license_ok = heterosta_init_license(lic);
if (!license_ok) {
    std::cerr << "[FATAL ERROR] Failed to initialize HeteroSTA license." << std::endl;
    return 1;
}
 
// 3. Proceed with library initialization and workflow.
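
If you resolve the license in more than one program, the lookup logic can be factored into a small function. The sketch below is one possible way to do that. The names resolve_license, LicenseSource, and ResolvedLicense are hypothetical and are not part of the HeteroSTA API. It also treats an empty string as "not set", which is an assumption on top of the example above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Where the license key was found.
enum class LicenseSource { Environment, Fallback, Missing };

struct ResolvedLicense {
    LicenseSource source;
    std::string key;  // empty when source == LicenseSource::Missing
};

// Hypothetical helper (not part of HeteroSTA): prefer the environment
// variable, then an optional fallback. Empty strings count as unset.
inline ResolvedLicense resolve_license(const char* env_name, const char* fallback) {
    assert(env_name != nullptr);
    const char* from_env = std::getenv(env_name);
    if (from_env != nullptr && from_env[0] != '\0') {
        return {LicenseSource::Environment, from_env};
    }
    if (fallback != nullptr && fallback[0] != '\0') {
        return {LicenseSource::Fallback, fallback};
    }
    return {LicenseSource::Missing, std::string()};
}
```

With this helper, the caller would check that the source is not Missing and then pass key.c_str() to heterosta_init_license().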

The Standard STA Workflow

Any timing analysis follows a logical sequence of operations. This section breaks down the essential steps and explains the different APIs HeteroSTA provides for each, allowing you to choose the approach that best fits your application.

Step 1: Initialize the Environment

Every session begins with creating an STAHoldings environment. This object serves as the central context for all STA operations. It is also best practice to initialize the logger to receive feedback from the engine.

  • APIs: heterosta_init_logger(), heterosta_new(), heterosta_free()

Example:

#include "heterosta.h"
#include <cstdint>   // uint8_t
#include <iostream>  // std::cout, std::cerr
 
// Define a C-compatible logger callback.
extern "C" void cpp_log_callback(uint8_t level, const char* message) {
    const char* level_str;
    switch (level) {
        case 1: level_str = "ERROR"; break;
        case 2: level_str = "WARN "; break;
        case 3: level_str = "INFO "; break;
        case 4: level_str = "DEBUG"; break;
        case 5: level_str = "TRACE"; break;
        default: level_str = "UNKNW"; break;
    }
    if (level <= 2) { // ERROR and WARN to stderr
        std::cerr << "[" << level_str << "] " << message << std::endl;
    } else { // INFO, DEBUG, TRACE to stdout
        std::cout << "[" << level_str << "] " << message << std::endl;
    }
}
 
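// Create and initialize the STA environment.
heterosta_init_logger(cpp_log_callback);
STAHoldings* sta = heterosta_new();
if (sta == nullptr) { // Returned when the license was not initialized.
    std::cerr << "[FATAL ERROR] heterosta_new() returned NULL." << std::endl;
    return 1;
}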
 
// ... perform all analysis ...
 
// Free the environment at the end.
heterosta_free(sta);

Advanced: Multi-GPU Configuration

For systems with multiple GPUs, HeteroSTA provides specific functions to manage hardware resources and isolate performance:

  • Check Availability: Use heterosta_get_num_cuda_devices() to retrieve the count of detected CUDA devices.
  • Specific Initialization: Use heterosta_new_with_device(id) instead of heterosta_new() to create an environment directly bound to a specific GPU ID (e.g., 0 or 1).
  • Dynamic Assignment: Use heterosta_set_cuda_device_id(sta, id) to assign or change the GPU for an existing STA instance.
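
When several STA instances run side by side, one simple policy is to spread them across the available GPUs in round-robin order. The sketch below shows only that index arithmetic. The function device_for_worker is hypothetical. In a real program the device count would come from heterosta_get_num_cuda_devices() and the result would be passed to heterosta_new_with_device() or heterosta_set_cuda_device_id(). The exact integer types of those APIs are not shown in this guide, so plain int is an assumption.

```cpp
#include <cassert>

// Hypothetical helper (not part of HeteroSTA): map a worker index to a GPU ID
// in round-robin order. Returns -1 when no CUDA device is available.
inline int device_for_worker(int num_devices, int worker_index) {
    assert(worker_index >= 0);
    if (num_devices <= 0) {
        return -1;  // caller should fall back to a CPU-only configuration
    }
    return worker_index % num_devices;
}
```

For example, with two devices, workers 0, 1, 2, and 3 would be bound to GPUs 0, 1, 0, and 1.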

Step 2: Select a Delay Model

You must choose a delay calculator, which determines the accuracy and performance of the analysis. This choice is made once per session. The Arnoldi calculator is generally recommended for its balance of accuracy and speed.

  • APIs: heterosta_set_delay_calculator_elmore(), heterosta_set_delay_calculator_arnoldi(), etc.

Example:

// Select the Arnoldi delay calculator for high accuracy.
heterosta_set_delay_calculator_arnoldi(sta);
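
To give a sense of what the simplest model computes, here is a sketch of the textbook Elmore delay on an RC tree. It is an illustration of the general technique only and makes no claim about how HeteroSTA implements its Elmore calculator. The RCNode struct and the elmore_delays function are hypothetical names, and the sketch assumes nodes are stored so that every parent comes before its children.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One node of an RC tree. Node 0 is the driver (root).
struct RCNode {
    int parent;   // index of the parent node, -1 for the root
    double res;   // resistance of the wire segment parent -> this node
    double cap;   // capacitance lumped at this node
};

// Textbook Elmore delay from the root to every node.
// Assumes parent index < child index for every non-root node.
inline std::vector<double> elmore_delays(const std::vector<RCNode>& nodes) {
    const std::size_t n = nodes.size();
    std::vector<double> downstream_cap(n), delay(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        downstream_cap[i] = nodes[i].cap;
    }
    // Accumulate subtree capacitance from the leaves toward the root.
    for (std::size_t i = n; i-- > 1;) {
        assert(nodes[i].parent >= 0 &&
               static_cast<std::size_t>(nodes[i].parent) < i);
        downstream_cap[nodes[i].parent] += downstream_cap[i];
    }
    // Each segment contributes R_segment * (capacitance downstream of it).
    for (std::size_t i = 1; i < n; ++i) {
        delay[i] = delay[nodes[i].parent] + nodes[i].res * downstream_cap[i];
    }
    return delay;
}
```

Elmore is a first-order estimate, so it is fast but can be pessimistic on resistive nets. Higher-order methods such as Arnoldi-based reduction trade some runtime for a closer match to the real waveform.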

Step 3: Load Timing Libraries

Before the netlist can be understood, HeteroSTA needs the characterization data for the standard cells, which is provided in Liberty (.lib) files. You must load libraries for both EARLY (min/hold) and LATE (max/setup) timing corners.

  • APIs: heterosta_read_liberty() (for single files), heterosta_batch_read_liberty() (for multiple files in parallel)

Example:

// Load separate library files for both corners.
heterosta_read_liberty(sta, EARLY, "fast.lib");
heterosta_read_liberty(sta, LATE, "slow.lib");

Step 4: Load the Design

Provide the circuit's logical structure and physical parasitics.

Loading the Netlist

  • Method A: From a Verilog File. This is the standard method for sign-off STA. It handles file parsing and database population in one call.

    • API: heterosta_read_netlist()
    • Example: heterosta_read_netlist(sta, "design.v", "top_module_name");
  • Method B: From an In-Memory Database. This method is for integration with tools, such as placers, that already hold a netlist in memory (via the NetlistDB struct). It provides fine-grained control over pin and cell IDs.

    • API: heterosta_set_netlistdb()
    • Example:
      // 1. Populate the custom C++ database objects from your tool's own data.
      db::FlatPlaceDB flatdb = create_from_custom_data();
      // 2. Build the NetlistDB object.
      NetlistDB* netlistdb = flatdb.build_netlistdb(false);
      // 3. Set it in the STA environment.
      heterosta_set_netlistdb(sta, netlistdb);
  • A detailed demonstration of building a NetlistDB from scratch is available in the Netlist Loading Demonstration.

Loading RC Parasitics

To accurately model signal propagation time, the delay calculator needs the resistance (R) and capacitance (C) of each net.

  • Method A: From a SPEF File. This is the standard approach for accurate, post-layout STA, reading parasitics from a SPEF file.

    • API: heterosta_read_spef()
    • Example: heterosta_read_spef(sta, "design.spef");
  • Method B: Estimation from Placement. Used during physical design when a SPEF file is unavailable. It estimates parasitics from pin coordinates using internal Steiner tree algorithms (FLUTE or PDR).

    • API: heterosta_extract_rc_from_placement()
    • Example: heterosta_extract_rc_from_placement(sta, xs, ys, 0.10, 0.10, 0.05, 0.05, 0.0, 8, 0.3, 0, use_cuda);
  • Method C: From an In-Memory Database. Analogous to heterosta_set_netlistdb(), this method sets RC parasitics directly from an in-memory data structure (via the FlattenedParasiticsFFI struct). This bypasses the need for intermediate SPEF files, making it ideal for tight integration with external extractors or custom routing tools.

    • API: heterosta_build_flatten_parasitics()
    • Example: heterosta_build_flatten_parasitics(sta, &paras_ffi, use_cuda);
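
To illustrate the idea behind placement-based estimation (Method B), the sketch below derives a lumped R and C for one net from its pin coordinates. It is a deliberately simplified stand-in: HeteroSTA builds Steiner trees, while this sketch uses the half-perimeter wirelength (HPWL) of the pin bounding box, which matches the rectilinear Steiner length for nets with two or three pins and is a lower bound otherwise. The names LumpedRC, estimate_net_rc, unit_res, and unit_cap are hypothetical, and the unit values are not meant to correspond to the arguments of heterosta_extract_rc_from_placement().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Lumped parasitics of one net.
struct LumpedRC {
    double length;  // estimated wirelength
    double res;     // total wire resistance
    double cap;     // total wire capacitance
};

// Hypothetical estimator: wirelength is the half-perimeter of the bounding box
// of the pins; R and C scale linearly with length.
inline LumpedRC estimate_net_rc(const std::vector<float>& xs,
                                const std::vector<float>& ys,
                                double unit_res, double unit_cap) {
    assert(xs.size() == ys.size());
    if (xs.size() < 2) {
        return {0.0, 0.0, 0.0};  // a single pin has no wire
    }
    const auto x_range = std::minmax_element(xs.begin(), xs.end());
    const auto y_range = std::minmax_element(ys.begin(), ys.end());
    const double length =
        (static_cast<double>(*x_range.second) - *x_range.first) +
        (static_cast<double>(*y_range.second) - *y_range.first);
    return {length, unit_res * length, unit_cap * length};
}
```

A real estimator also needs the tree topology, since the delay to each sink depends on where the branches are and not only on the total length.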

Step 5: Prepare the Timing Graph

After all design data is loaded, it must be finalized into a performance-optimized format and used to construct the timing graph. This is a mandatory step before analysis can proceed.

  • APIs: heterosta_flatten_all(), heterosta_build_graph()

Example:

// Finalize the loaded data. This is a one-way operation.
heterosta_flatten_all(sta);
 
// Construct the internal timing graph for analysis.
heterosta_build_graph(sta);

Step 6: Apply Constraints and Run Analysis

With the graph built, apply timing constraints from an SDC file and run the core analysis functions. The sequence of these calls is critical.

  • APIs:
    • heterosta_read_sdc(): Loads constraints like clocks and input/output delays.
    • heterosta_update_delay(): Calculates delays for all cell and net arcs. Must be called before heterosta_update_arrivals().
    • heterosta_update_arrivals(): Propagates arrival times through the graph to determine slack.

Example:

// Apply timing constraints.
heterosta_read_sdc(sta, "design.sdc");
 
// Run the core STA calculations in order.
heterosta_update_delay(sta, false);
heterosta_update_arrivals(sta, false);
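
Because the call order matters, an application that wraps HeteroSTA may want to track which stage it has reached and reject out-of-order calls early. The sketch below is one hypothetical way to do this. Stage and WorkflowGuard are not part of the HeteroSTA API, the stage list is a simplification of Steps 1 through 6, and parasitics loading is not tracked.

```cpp
#include <cassert>

// Simplified stages of the workflow, in the order they must be reached.
enum class Stage {
    Created = 0,
    LibrariesLoaded,
    NetlistLoaded,
    Flattened,
    GraphBuilt,
    Constrained,
    DelaysUpdated,
    ArrivalsUpdated
};

// Hypothetical guard: records progress and rejects out-of-order transitions.
class WorkflowGuard {
public:
    // Advances and returns true only if 'next' directly follows the current
    // stage, or if analysis is being re-run after a completed pass.
    bool advance(Stage next) {
        const int cur = static_cast<int>(stage_);
        const int nxt = static_cast<int>(next);
        const bool next_in_sequence = (nxt == cur + 1);
        const bool rerun_analysis = (stage_ == Stage::ArrivalsUpdated &&
                                     next == Stage::DelaysUpdated);
        if (!next_in_sequence && !rerun_analysis) {
            return false;
        }
        stage_ = next;
        return true;
    }

    // Debug-build variant that aborts on an out-of-order call.
    void require(Stage next) {
        const bool ok = advance(next);
        assert(ok && "workflow step called out of order");
        (void)ok;
    }

    Stage current() const { return stage_; }

private:
    Stage stage_ = Stage::Created;
};
```

A wrapper would call advance(Stage::DelaysUpdated) right before heterosta_update_delay() and advance(Stage::ArrivalsUpdated) right before heterosta_update_arrivals(), and skip the library call when advance() returns false.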

Step 7: Retrieve Timing Results

Finally, retrieve the results as either human-readable reports or as raw data arrays for other tools.

Method A: Formatted Text Reports

This method reports summary metrics (WNS/TNS) through output parameters and writes critical path reports to a file.

  • APIs: heterosta_report_wns_tns_max/min(), heterosta_dump_paths_max/min_to_file()

Example:

#include <stdio.h>
float wns, tns;
heterosta_report_wns_tns_max(sta, &wns, &tns, false);
printf("WNS: %.3f, TNS: %.3f\n", wns, tns);

Method B: Raw Data Arrays

This method provides direct access to the raw slack values for every pin, ideal for guiding optimization engines.

  • API: heterosta_report_slacks_at_max/min()

Example:

#include <stdlib.h>
 
/* Assume 'num_pins' is known. */
float (*slacks)[2] = (float (*)[2])malloc(num_pins * sizeof(float[2]));
heterosta_report_slacks_at_max(sta, slacks, false);
 
/* Use the raw slack data, e.g., for pin 5 */
float pin_5_rise_slack = slacks[5][0];
free(slacks);
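
Once the raw array is available, summary metrics can be recomputed for any subset of pins. The sketch below computes a worst negative slack and total negative slack over a caller-supplied list of endpoint pins, using the same [pin][2] layout as the example above. SlackSummary and summarize_endpoint_slacks are hypothetical names. The sketch assumes that unconstrained pins may hold non-finite values and skips them, and it reports a WNS of 0 when nothing violates. Neither convention is confirmed by this guide.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct SlackSummary {
    float wns;               // most negative endpoint slack, 0 if none violate
    float tns;               // sum of negative endpoint slacks
    std::size_t violations;  // number of violating endpoints
};

// Hypothetical helper: summarize slacks over the given endpoint pin IDs.
// Each endpoint contributes the worse of its two transition slacks.
inline SlackSummary summarize_endpoint_slacks(
        const float (*slacks)[2], const std::vector<std::size_t>& endpoints) {
    assert(slacks != nullptr);
    SlackSummary out{0.0f, 0.0f, 0};
    for (std::size_t pin : endpoints) {
        const float worst = std::fmin(slacks[pin][0], slacks[pin][1]);
        if (!std::isfinite(worst) || worst >= 0.0f) {
            continue;  // unconstrained (assumed non-finite) or meeting timing
        }
        out.wns = std::fmin(out.wns, worst);
        out.tns += worst;
        ++out.violations;
    }
    return out;
}
```

Restricting the sum to endpoints matters: adding up negative slacks over every pin would count the same violating path many times.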

Method C: Detailed Path Collection

This method returns a structured object containing detailed information for the most critical paths, including pin sequences, arrival times, and slacks. This allows external applications to programmatically traverse and analyze specific timing paths.

  • APIs: heterosta_report_paths(), heterosta_free_pba_path_collection()

Example:

#include <inttypes.h>  /* PRIuPTR */
#include <stdint.h>    /* uintptr_t */
#include <stdio.h>
 
/* Collect top 100 setup violations (max paths) with slack < 0.0 */
struct PBAPathCollectionCppInterface* paths = heterosta_report_paths(sta, 100, 1, true, 0.0f, true, false, false);
 
/* Iterate through collected paths */
for (uintptr_t i = 0; i < paths->num_paths; ++i) {
    float path_slack = paths->slacks[i];
    uintptr_t start_idx = paths->path_st[i];
    uintptr_t end_idx = paths->path_st[i+1];
 
    /* uintptr_t is not guaranteed to match %lu, so print it with PRIuPTR. */
    printf("Path %" PRIuPTR " (Slack: %.3f): ", i, path_slack);
    
    /* Iterate through pins in this path using CSR indices */
    for (uintptr_t k = start_idx; k < end_idx; ++k) {
        uintptr_t pin_rf = paths->pin_rfs[k];
        uintptr_t pin_id = pin_rf >> 1; /* Extract pin ID by shifting out rise/fall bit */
        if (k < end_idx - 1)
            printf("%" PRIuPTR " -> ", pin_id);
        else
            printf("%" PRIuPTR "\n", pin_id);
    }
}
 
/* Always free the collection to prevent memory leaks */
heterosta_free_pba_path_collection(paths);
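
The CSR-style layout used above can be tried out without the library. The sketch below decodes one path from plain arrays laid out the same way: an offsets array with one more entry than there are paths, and a flat array whose entries pack the pin ID and a rise/fall bit. PathStep and decode_path are hypothetical names. The sketch exposes the low bit as rf_bit without saying which value means rise, because this guide does not specify that.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct PathStep {
    std::uintptr_t pin_id;  // pin identifier
    unsigned rf_bit;        // low bit of the packed value (rise/fall flag)
};

// Hypothetical helper: decode path 'i' from CSR-style arrays.
// path_st holds num_paths + 1 offsets into pin_rfs; each pin_rfs entry
// packs (pin_id << 1) | rf_bit.
inline std::vector<PathStep> decode_path(const std::uintptr_t* path_st,
                                         const std::uintptr_t* pin_rfs,
                                         std::uintptr_t i) {
    assert(path_st != nullptr && pin_rfs != nullptr);
    std::vector<PathStep> steps;
    for (std::uintptr_t k = path_st[i]; k < path_st[i + 1]; ++k) {
        const std::uintptr_t packed = pin_rfs[k];
        steps.push_back({packed >> 1, static_cast<unsigned>(packed & 1u)});
    }
    return steps;
}
```

Copying paths into your own structures like this lets you call heterosta_free_pba_path_collection() immediately and keep working with the decoded data.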

Use Case 1: Standard Sign-off STA Flow

Goal: To run a complete STA on a design described by Verilog and SPEF files and generate a timing report.

Workflow: This example combines the "Method A" choices from the conceptual steps into a single, linear program.

Example:

#include "heterosta.h"
#include <stdio.h>
 
int main() {
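    // 1. Initialize STA Environment
    const char* lic = "lic:heterosta:your-license-key-string-here";
    if (!heterosta_init_license(lic)) {
        fprintf(stderr, "Failed to initialize HeteroSTA license.\n");
        return 1;
    }
    heterosta_init_logger(my_logger); // my_logger: a callback like cpp_log_callback in Step 1
    STAHoldings* sta = heterosta_new();
    if (sta == NULL) {
        fprintf(stderr, "heterosta_new() returned NULL.\n");
        return 1;
    }
    heterosta_set_delay_calculator_arnoldi(sta); // Set delay model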
 
    // 2. Load Libraries
    heterosta_read_liberty(sta, EARLY, "early.lib");
    heterosta_read_liberty(sta, LATE, "late.lib");
 
    // 3. Load Design
    heterosta_read_netlist(sta, "design.v", "top_module");
    heterosta_read_spef(sta, "design.spef");
 
    // 4. Prepare Graph
    heterosta_flatten_all(sta);
    heterosta_build_graph(sta);
 
    // 5. Apply Constraints & Run Analysis
    heterosta_read_sdc(sta, "design.sdc");
    heterosta_update_delay(sta, false);
    heterosta_update_arrivals(sta, false);
 
    // 6. Report Results
    float wns, tns;
    heterosta_report_wns_tns_max(sta, &wns, &tns, false);
    printf("WNS: %.3f, TNS: %.3f\n", wns, tns);
    heterosta_dump_paths_max_to_file(sta, 100, 5, "report.txt", false);
 
    // 7. Clean up
    heterosta_free(sta);
    return 0;
}

Use Case 2: Integration with an Optimization Engine

Goal: To use HeteroSTA's timing capabilities to guide an external tool like a placer.

Workflow: This example uses the "Method B" choices (in-memory netlist, estimated parasitics, raw slack data) inside a conceptual optimization loop.

Example:

// 1. Initialize and Configure STA Environment
const char* lic = "lic:heterosta:your-license-key-string-here";
if (!heterosta_init_license(lic)) {
    return 1; // Without a valid license, heterosta_new() returns NULL.
}
heterosta_init_logger(my_logger); // my_logger: a callback like cpp_log_callback in Step 1
STAHoldings* sta = heterosta_new();
heterosta_set_delay_calculator_arnoldi(sta); // Set delay model
heterosta_read_liberty(sta, EARLY, "early.lib");
heterosta_read_liberty(sta, LATE, "late.lib");
 
// 2. Build and Load Custom Netlist (using the demo's C++ helpers)
db::FlatPlaceDB flatdb = create_from_placer_db();
NetlistDB* netlistdb = flatdb.build_netlistdb(false);
heterosta_set_netlistdb(sta, netlistdb);
 
// 3. Prepare Graph and Load Constraints
heterosta_flatten_all(sta);
heterosta_build_graph(sta);
heterosta_read_sdc(sta, "design.sdc");
 
// 4. Enter Optimization Loop
// Allocate the slack buffer once and reuse it in every iteration.
// 'num_pins' is assumed to be known from the netlist.
float (*slacks)[2] = (float (*)[2])malloc(num_pins * sizeof(float[2]));
for (int i = 0; i < NUM_ITERATIONS; ++i) {
    // 4.1 Get current pin coordinates from placer
    float* pin_coords_x = my_placer->get_x_coords();
    float* pin_coords_y = my_placer->get_y_coords();
 
    // 4.2 Estimate Parasitics from placement
    heterosta_extract_rc_from_placement(sta, pin_coords_x, pin_coords_y,
                                        0.10, 0.10, 0.05, 0.05, 
                                        0.0, 8, 0.3, 0, false);
 
    // 4.3 Run STA
    heterosta_update_delay(sta, false);
    heterosta_update_arrivals(sta, false);
 
    // 4.4 Get Raw Slacks
    heterosta_report_slacks_at_max(sta, slacks, false);
 
    // 4.5 Placer uses the 'slacks' array to update its solution
    my_placer->update_placement_with_slacks(slacks);
}
free(slacks);
 
// 5. Clean up
heterosta_free(sta);
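
What the placer does with the slack array is up to the placer. As one illustration, the sketch below turns per-pin slacks into criticality weights, a common ingredient of timing-driven placement: pins that meet timing keep a weight of 1, and violating pins get a weight that grows with how close their slack is to the worst slack. The function slack_to_weights and the parameter alpha are hypothetical, and non-finite slacks are assumed to mark unconstrained pins.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: map per-pin slacks (layout [pin][2]) to weights.
// weight = 1 for non-violating pins, 1 + alpha * (slack / wns) otherwise,
// so the most critical pin gets 1 + alpha.
inline std::vector<float> slack_to_weights(const float (*slacks)[2],
                                           std::size_t num_pins, float alpha) {
    assert(slacks != nullptr);
    std::vector<float> worst(num_pins);
    float wns = 0.0f;
    for (std::size_t i = 0; i < num_pins; ++i) {
        worst[i] = std::fmin(slacks[i][0], slacks[i][1]);
        if (std::isfinite(worst[i])) {
            wns = std::fmin(wns, worst[i]);
        }
    }
    std::vector<float> weights(num_pins, 1.0f);
    if (wns >= 0.0f) {
        return weights;  // nothing violates, keep uniform weights
    }
    for (std::size_t i = 0; i < num_pins; ++i) {
        if (std::isfinite(worst[i]) && worst[i] < 0.0f) {
            weights[i] = 1.0f + alpha * (worst[i] / wns);
        }
    }
    return weights;
}
```

In the loop above, a placer's update_placement_with_slacks() could apply weights like these to the nets attached to each pin before the next placement step.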

Next Steps

For detailed information on every function, including all parameters and data structures, please refer to the complete API Reference document.