5.2 Introduction to Verilog Clocks

Keywords: Clock source, clock skew, clock jitter, clock transition time, clock delay, clock tree, double-edge clock

Nearly every non-trivial digital design relies on clocks, and clocks are the foundation of all sequential logic. Clock skew was mentioned earlier in the discussion of setup and hold times. This section summarizes the clock concepts needed for sound digital design.

Clock Source

Depending on the location of the clock source within the digital design module, it can be classified as an external or internal clock source.

External Clock Source:

RC/LC Oscillation Circuit: Utilizes positive or negative feedback circuits to generate periodic clock signals. These clock sources are simple and cover a wide frequency range, but they operate at relatively low frequencies and have poor stability.

Passive/Active Crystal Oscillator: Uses the piezoelectric effect of quartz crystals (pressure and electrical signals can be converted into each other) to generate resonant signals. These clock sources have high frequency accuracy, good stability, low noise, and minimal temperature drift. In active crystal oscillators, voltage control or temperature compensation is often added, resulting in good phase and frequency characteristics. However, the circuit implementation is relatively complex, the frequency band is narrow, and the frequency is essentially non-adjustable.

When debugging a specific circuit, purpose-built circuits (such as Schmitt triggers) or signal generator equipment may also be used as clock sources.

Internal Clock Source:

Phase-Locked Loop (PLL):

Due to process and cost constraints, ordinary crystal oscillators cannot reach very high frequencies. A PLL multiplies a lower-frequency reference to produce a stable high-frequency clock. Because the PLL is integrated into the design module, the clock it supplies to the digital circuit has low delay and good stability.

Clock Division: Some modules operate at frequencies lower than the system clock frequency, so the system clock needs to be divided to obtain a lower frequency clock.

Counting within an always block and outputting the clock signal is a common method for frequency dividers. The implementation logic for arbitrary frequency division is detailed in the next section, "5.3 Clock Division."
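As a rough illustration of the counting approach, the sketch below divides the clock by an even ratio. The module name, the DIV parameter, and the 16-bit counter width are assumptions made for this example and are not necessarily the implementation presented in section 5.3.

```verilog
// Hypothetical even divider: names and the DIV parameter are illustrative.
module clk_div_even_sketch #(
    parameter DIV = 4              // division ratio, assumed even and >= 2
)(
    input      rstn,
    input      clk,
    output reg clk_div
);
   // Counts 0 .. DIV/2-1; the output toggles each time the count wraps,
   // so one output period spans DIV input clock periods.
   reg [15:0] cnt;
   always @(posedge clk or negedge rstn) begin
      if (!rstn) begin
         cnt     <= 16'd0;
         clk_div <= 1'b0;
      end
      else if (cnt == DIV/2 - 1) begin
         cnt     <= 16'd0;
         clk_div <= ~clk_div;
      end
      else begin
         cnt     <= cnt + 16'd1;
      end
   end
endmodule
```

With DIV = 4, clk_div toggles every two input clock cycles, giving a 50% duty cycle at one quarter of the input frequency. Odd ratios need extra logic to reach a 50% duty cycle.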

Clock Switching: The operating frequency of the system or certain modules may change under specific conditions, such as frequency reduction in low-power mode or frequency increase to enhance computational capability. In such cases, the system often has multiple clock sources for clock switching when needed.

If the clock switching logic is not designed carefully, glitches are likely to appear on the output clock during the transition, which can adversely affect the circuit. Safe switching logic is detailed in the following chapter: "5.4 Clock Switching."

Digital systems often adopt a scheme where an external crystal oscillator input is multiplied by an internal PLL. Then, according to design requirements, clock division or clock switching is performed.

Clock Characteristics

In RTL simulation, synchronous clocks are ideal by default: clock transitions are instantaneous, clock edges are aligned across modules, and there is no delay or jitter. In actual circuits, clock propagation and clock transitions always take time. A robust digital design must account for these non-ideal clock characteristics; otherwise, timing violations can result.

Below is a brief explanation of some clock characteristics.

Clock Skew (Skew):

Due to network delays, the clock signal cannot ensure that the clock edges at different flip-flop ports are aligned, i.e., there is a phase difference between the clock signals at different flip-flop ports. This difference is called clock skew. The schematic is as follows:

Generally, clock skew is not directly related to clock frequency but is related to factors such as trace length, load capacitance, and load quantity.
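Skew can be made visible in simulation by giving two copies of the same clock different delays. The testbench sketch below is only an illustration: the module name, the signal names, and the 0.8 ns and 1.1 ns network delays are assumed values.

```verilog
`timescale 1ns/1ps

// Hypothetical testbench: all names and delay values are illustrative.
module skew_sketch;
   reg      clk_src = 1'b0;
   wire     clk_ff1, clk_ff2;
   realtime t_ff1;

   always #5 clk_src = ~clk_src;     // ideal 100 MHz source

   // Assumed network delays to the clock pins of two flip-flops
   assign #0.8 clk_ff1 = clk_src;
   assign #1.1 clk_ff2 = clk_src;

   // Record the arrival time at the first pin, then report the
   // difference when the same edge reaches the second pin.
   always @(posedge clk_ff1) t_ff1 = $realtime;
   always @(posedge clk_ff2)
      $display("skew = %0.3f ns", $realtime - t_ff1);

   initial #50 $finish;
endmodule
```

With these assumed delays the reported skew should be about 0.3 ns on every rising edge, independent of the clock period, which matches the point that skew is set by the network rather than by the clock frequency.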

Clock Jitter (Jitter):

Clock jitter is the deviation of a clock edge from its ideal position. It does not accumulate over time, and the edge may arrive either early or late. Jitter can be described quantitatively by jitter frequency and jitter amplitude. In digital design, clock jitter is usually described in terms of time, as shown in the schematic below.

Clock jitter can be divided into random jitter and deterministic (fixed) jitter.

Sources of random jitter include thermal noise, semiconductor process variation, etc.

Sources of deterministic jitter include switching power supplies, electromagnetic interference, and improper layout and routing.
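For simulation purposes, non-accumulating jitter can be approximated by delaying each edge of an ideal clock by a small random amount. The sketch below is one possible way to do this, not a model of any specific jitter source; the names, the 100 MHz clock, and the 0 to 200 ps range are assumptions.

```verilog
`timescale 1ns/1ps

// Hypothetical testbench fragment: all names and numbers are illustrative.
module jitter_sketch;
   reg  clk_ideal = 1'b0;
   reg  clk_jit   = 1'b0;
   real offset;                       // per-edge delay in ns

   always #5 clk_ideal = ~clk_ideal;  // ideal 100 MHz clock

   // Each ideal edge is delayed by a fresh random 0 to 200 ps, so every
   // edge stays within +/-100 ps of its mean position and the error
   // never accumulates from one cycle to the next.
   always @(clk_ideal) begin
      offset   = ({$random} % 201) / 1000.0;
      clk_jit <= #(offset) clk_ideal;
   end

   initial #200 $finish;
endmodule
```

Because every edge of clk_jit is derived from the ideal clock rather than from the previous jittered edge, the long-term frequency stays exact while individual periods vary.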

In the synthesis tool Design Compiler, clock skew and jitter are modeled together as clock uncertainty.

Transition Time (Transition):

When the clock switches from low to high (rising edge) or from high to low (falling edge), the level does not change instantly but takes a finite time to complete. This time is called the clock's transition time, as shown in the schematic below.

The transition time is related to the cell library process and capacitive load.

Clock Delay (Latency):

The delay time from the clock source (such as a crystal oscillator, PLL, or frequency divider output) to the flip-flop port is called clock delay. Clock delay includes source delay (source latency) and network delay (network latency), as shown in the figure below.

Source delay is the transmission time of the clock signal from the actual clock origin to the clock definition point of the design module. As shown in the figure, it is 3ns.

Network delay is the transmission time from the clock definition point of the design module to the clock port of the flip-flop within the module, and the transmission path may pass through buffers. As shown in the figure, it is 1ns.

Source delay (source latency) is a delay common to all flip-flops within the design module, so it does not affect clock skew (skew).

Clock Tree

In digital design, various modules should use synchronous clock circuits, and flip-flops driven by the same clock signal together form a clock domain. In an ideal circuit, the clock signal would arrive simultaneously at all clock ports of the same clock domain. However, due to various delays in practice, this zero-delay clock characteristic is difficult to achieve. Moreover, the driving capability of the clock signal is limited and cannot independently provide effective fan-out for a clock domain with a large number of flip-flops. To solve the problems of clock delay and driving, a clock tree system needs to be used to manage the clock signal to ensure good timing and driving capability.

The clock tree is a network of balanced buffer cells that fans out from a single clock source point through a series of buffer stages. The actual clock tree structure with added clock buffers (orange triangle modules in the figure) is shown below.

The clock tree does not reduce the time it takes for the clock signal to reach each flip-flop but reduces the time difference between the arrival times at each flip-flop. Typically, backend designers complete the design of the clock tree by inserting clock buffers. Frontend designers often need to ensure the correctness of the clock scheme and digital logic functionality.

Other clock classifications:

Synchronous, Asynchronous Clocks:

Detailed explanations can be found in "4.1 Synchronous and Asynchronous." When the clocks are sourced from the same origin and satisfy an integer multiple relationship, they are generally considered synchronous. The definition of synchronous clocks in digital design is quite broad. Logic under the same clock domain does not require synchronization processing.

Let's understand the concept of synchronous clocks from the perspective of synchronous circuits.

A synchronous circuit is a circuit composed of sequential and combinational logic circuits. The characteristic of a synchronous circuit is that the clock terminals of all flip-flops are connected together and connected to the system clock terminal. The state of the circuit can only change when the clock pulse arrives. The changed state will remain until the next clock pulse arrives. During this period, regardless of whether the external input x changes, each state in the state table is stable.

Gated Clock:

The basic principle of gated clocks: Enable the clock when the enable signal is active. Disable the clock when the enable signal is inactive.

Since gated clocks can turn off the working clock at the appropriate time, they are widely used in low-power design. The simplest implementation logic of gated clocks is to directly perform an "AND" operation on the enable signal and the clock signal, but this is unsafe and prone to glitches. For detailed gated clock introductions, please refer to "6.4 RTL-Level Low-Power Design (Part 2)."
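A widely used safer alternative to the bare AND gate is to hold the enable in a latch that is transparent only while the clock is low. The behavioral sketch below illustrates the idea; the module and signal names are assumptions, it is not necessarily the scheme described in section 6.4, and real designs normally instantiate a library clock-gating cell instead.

```verilog
// Hypothetical latch-based clock gate: names are illustrative.
module clk_gate_sketch(
    input  clk,
    input  en,
    output gclk
);
   // The latch is transparent while clk is low and holds its value while
   // clk is high, so en_lat can only change during the low phase.
   reg en_lat;
   always @(clk or en) begin
      if (!clk)
         en_lat <= en;
   end

   // Changes on en during the high phase never reach the AND gate,
   // so they cannot truncate or create a clock pulse.
   assign gclk = clk & en_lat;
endmodule
```

With the bare AND gate, an enable that changes while clk is high cuts the current pulse short or starts a partial one. Here the change is deferred until clk is low, where the AND output is 0 regardless of the enable.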

Double-Edge Clock:

Some modules can transmit data on both the rising and falling edges of the clock, achieving double the rate.

DDR (Double Data Rate) SDRAM is a typical example of using double-edge data transmission.

A typical DDR data transmission schematic is shown below.

Below is a simple simulation of the behavior of double-edge clock data transmission.

The basic design idea is to use the double-edge clock to read data and then select and output the data through the inverted chip select signal to complete the data transmission on both edges of the clock.

The Verilog code description is as follows.

Example

module double_rate(
    input               rstn,
    input               clk,
    input               csn,

    input [7:0]         din,
    input               din_en,
    output [7:0]        dout,
    output              dout_en);

   //capture at posedge
   reg [7:0]            datap_r;
   reg                  datap_en_r;
   always @(posedge clk or negedge rstn) begin
      if (!rstn) begin
         datap_r        <= 'b0;
         datap_en_r     <= 1'b0;
      end
      else if (din_en) begin
         datap_r        <= din;
         datap_en_r     <= 1'b1;
      end
      else begin
         datap_en_r     <= 1'b0;
      end
   end

   //capture at negedge
   reg [7:0]            datan_r;
   reg                  datan_en_r;
   always @(negedge clk or negedge rstn) begin
      if (!rstn) begin
         datan_r        <= 'b0;
         datan_en_r     <= 1'b0;
      end
      else if (din_en) begin
         datan_r        <= din;
         datan_en_r     <= 1'b1;
      end
      else begin
         datan_en_r     <= 1'b0;
      end
   end

   assign dout = !csn ? datap_r : datan_r;
   assign dout_en = datan_en_r | datap_en_r;
endmodule

The testbench description is as follows, where the double-edge data transmission module's clock frequency is 100MHz, but the input data rate is 200MHz.

Example

`timescale 1ns/1ps

module test;
   reg         clk_100mhz, clk_200mhz;
   reg         rstn;
   reg         csn;
   reg [7:0]   din;
   reg         din_en;
   wire [7:0]  dout;
   wire        dout_en;

   always #(2.5)    clk_200mhz  = ~clk_200mhz;
   always @(posedge clk_200mhz)
                   clk_100mhz  = ~clk_100mhz;

   initial begin
      clk_100mhz  = 0;
      clk_200mhz  = 0;
      rstn        = 0;
      din         = 0;
      din_en      = 0;
      csn         = 0;
      //start work
      #11 rstn    = 1;
      @(negedge clk_100mhz);
      din_en      = 1;
      #0.2;
      csn         = 1; //csn=1 outputs data captured at falling edge
      //generate csn
      forever begin
         @(posedge clk_100mhz);
         #0.2;        //add a slight delay to ensure correct data capture
         csn = 0;     //csn=0 outputs data captured at rising edge
         @(negedge clk_100mhz);
         #0.2;
         csn = 1;     //csn=1 outputs data captured at falling edge
      end
   end

always @(negedge clk_200mhz) begin
   din <= {$random()} % 8'hFF; // Generate random data for transmission
end

double_rate u_double_rate(
   .rstn      (rstn),
   .clk       (clk_100mhz),
   .csn       (csn),
   .din       (din),
   .din_en    (din_en),
   .dout      (dout),
   .dout_en   (dout_en));

initial begin
   forever begin
      #100;
      if ($time >= 10000) $finish;
   end
end

endmodule // test

The simulation results for the first few data points are as follows.

As shown in the figure, the data transmission is normal, and the rate is twice the clock frequency.

This simulation is only a simple demonstration of data transmission on both clock edges, not a simulation of DDR's working principle. The working principle of DDR double-rate data transmission is much more complex than this simulation.

However, in general, using dual-edge clock logic is not recommended for several reasons.

A single always block cannot list both the rising and falling edges of the same clock in its sensitivity list, and the same variable cannot be assigned in two always blocks. Data captured on each edge therefore has to be kept in separate registers and merged afterward, which complicates signal communication. For example, the following description is incorrect. Although a simulator may not report an error, it generally cannot be synthesized into actual circuitry.

always @(posedge clk or negedge clk) begin

The data transmission rate is twice the data clock frequency. If you use rising and falling edge logic for RTL modeling, you also need to flip the chip select signal at the same rate as the clock; if you do not use the chip select signal, the module should introduce a clock signal that is twice the frequency of the data clock to properly select data.

With dual-edge clock logic, timing constraints must be defined correctly for both the rising and the falling edge. Clock constraints become more complex and routing requirements stricter, which makes debugging harder.

Designing with dual-edge clock logic requires high-quality clocks, and the clock tree design must consider many factors.