Easy Tutorial

7.1 Verilog Divider Design

Divider Principle (Fixed Point)

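Binary division follows the same procedure as decimal long division. Take 27 divided by 5 (5'b11011 divided by 3'b101) as an example.

The division proceeds as follows: starting from the most significant bit of the dividend, each step compares the current partial dividend with the divisor. If the partial dividend is not smaller than the divisor, the quotient bit is 1 and the divisor is subtracted to give the remainder; otherwise the quotient bit is 0 and the partial dividend is kept unchanged as the remainder. The remainder is then concatenated with the next lower bit of the dividend to form the next partial dividend, and the step repeats until the least significant bit of the dividend has been used.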

Note that the quotient must be as wide as the dividend, because the divisor may be 1. In the manual division example, the first step therefore compares only the most significant bit of 27, zero-extended to 3'b001, with the divisor 3'b101. Based on this procedure, a pipelined divider with configurable bit widths is designed; its latency in clock cycles equals the bit width of the dividend.
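The manual procedure can be traced step by step for 27 / 5. The table below is a worked illustration derived from the procedure described here, not a copy of the original figure. It assumes the bits of 5'b11011 are brought down one at a time, starting with the most significant bit, and that each partial dividend is written with 4 bits (one more than the divisor width).

```latex
\[
\begin{array}{c|c|c|c|c}
\text{step} & \text{bit brought down} & \text{partial dividend} & \text{quotient bit} & \text{remainder} \\ \hline
1 & 1 & 0001_2 = 1 & 0 & 001_2 = 1 \\
2 & 1 & 0011_2 = 3 & 0 & 011_2 = 3 \\
3 & 0 & 0110_2 = 6 & 1 & 001_2 = 1 \\
4 & 1 & 0011_2 = 3 & 0 & 011_2 = 3 \\
5 & 1 & 0111_2 = 7 & 1 & 010_2 = 2
\end{array}
\qquad\Longrightarrow\qquad
27 = 00101_2 \times 5 + 2 = 5 \times 5 + 2
\]
```

Reading the quotient bits from top to bottom gives 5'b00101 = 5, and the last remainder is 2. Five steps are needed, one per dividend bit, which is why the pipeline latency matches the dividend width.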

Divider Design

Single-Step Operation Design

When performing single-step division, the bit width of the single-step dividend (signal dividend) needs to be 1 bit more than the bit width of the original divisor (signal divisor) to avoid overflow.
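The extra bit can be justified with a short bound. The symbols here are introduced only for this illustration: d is the divisor (M bits wide), r is the remainder left by the previous step, and b is the next dividend bit that is appended to it.

```latex
\[
r \le d - 1,\quad b \in \{0, 1\}
\;\Longrightarrow\;
2r + b \;\le\; 2(d - 1) + 1 \;=\; 2d - 1 \;\le\; 2\,(2^{M} - 1) - 1 \;<\; 2^{M+1}
\]
```

So the single-step dividend always fits in M+1 bits, while M bits are not enough in general: with M = 3, d = 7, r = 6 and b = 1, the single-step dividend is 13, which exceeds the 3-bit maximum of 7.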

To support pipelining, each unit registers at its output the original divisor (input divisor, output divisor_kp) and the original dividend information (input dividend_ci, output dividend_kp), so that every stage works with the operands that belong to its own data.

Each single-step operation produces one new quotient bit (the least significant bit of signal merchant) and a remainder (signal remainder).

To build up the final result, the quotient received from the previous stage (merchant_ci) is shifted left by one bit and the new quotient bit is added to it.

The design of the single-step operation unit is as follows (file name divider_cell.v):

Example

// parameter M means the actual width of divisor
module divider_cell
  #(parameter N=5,
    parameter M=3)
  (
   input                     clk,
   input                     rstn,
   input                     en,

   input [M:0]               dividend,
   input [M-1:0]             divisor,
   input [N-M:0]             merchant_ci , // Quotient from the previous stage
   input [N-M-1:0]           dividend_ci , // Original dividend information

   output reg [N-M-1:0]      dividend_kp,  // Original dividend information
   output reg [M-1:0]        divisor_kp,   // Original divisor information
   output reg                rdy ,
   output reg [N-M:0]        merchant ,  // Quotient output of the computation unit
   output reg [M-1:0]        remainder   // Remainder output of the computation unit
  );

  always @(posedge clk or negedge rstn) begin
    if (!rstn) begin
      rdy            <= 'b0 ;
      merchant       <= 'b0 ;
      remainder      <= 'b0 ;
      divisor_kp     <= 'b0 ;
      dividend_kp    <= 'b0 ;
    end
    else if (en) begin
      rdy            <= 1'b1 ;
      divisor_kp     <= divisor ;  // Original divisor remains unchanged
      dividend_kp    <= dividend_ci ;  // Original dividend is passed
      if (dividend >= {1'b0, divisor}) begin
        merchant    <= (merchant_ci<<1) + 1'b1 ; // Quotient bit is 1
        remainder   <= dividend - {1'b0, divisor} ; // Calculate the remainder
      end
      else begin
        merchant    <= merchant_ci<<1 ;  // Quotient bit is 0
        remainder   <= dividend ;        // Remainder remains the same
      end
    end // if (en)
    else begin
      rdy            <= 'b0 ;
      merchant       <= 'b0 ;
      remainder      <= 'b0 ;
      divisor_kp     <= 'b0 ;
      dividend_kp    <= 'b0 ;
    end
  end

endmodule

In the top-level module (instantiated as `divider_man` in the testbench), every stage generated inside the block `sqrt_stepx` writes its remainder to `remainder_t[N_ACT-M-i]`; the generate block ends after this last connection.

The outputs `res_rdy`, `merchant`, and `remainder` are driven from element 0 of the corresponding arrays (`rdy_t[0]`, `merchant_t[0]`, and `remainder_t[0]`), which hold the valid flag, quotient, and remainder of the final stage. The module is then closed with `endmodule`.
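The wiring described above can be sketched as follows. This is a hedged reconstruction, not the original listing: the port list is taken from the testbench instantiation, and `rdy_t`, `merchant_t`, `remainder_t`, and `sqrt_stepx` come from the text, but the arrays `dividend_t` and `divisor_t`, the instance names, and the definition `N_ACT = M + N - 1` are assumptions chosen to match the `divider_cell` port widths. It also assumes N is at least 2.

```verilog
// Sketch only: chains N divider_cell stages (divider_cell.v must be compiled too).
module divider_man
  #(parameter N = 5,              // dividend width
    parameter M = 3)              // divisor width
  (
   input          clk,
   input          rstn,
   input          data_rdy,       // input data valid
   input  [N-1:0] dividend,
   input  [M-1:0] divisor,
   output         res_rdy,        // result valid
   output [N-1:0] merchant,       // final quotient
   output [M-1:0] remainder       // final remainder
  );

  // Assumption: width parameter handed to each cell, so N_ACT-M+1 = N stages
  localparam N_ACT = M + N - 1;

  // Per-stage outputs; index N_ACT-M is the first stage, index 0 the last
  wire [N_ACT-M-1:0] dividend_t  [N_ACT-M:0];
  wire [M-1:0]       divisor_t   [N_ACT-M:0];
  wire [M-1:0]       remainder_t [N_ACT-M:0];
  wire [N_ACT-M:0]   merchant_t  [N_ACT-M:0];
  wire [N_ACT-M:0]   rdy_t;

  // First stage: only the MSB of the dividend, zero-extended to M+1 bits
  divider_cell #(.N(N_ACT), .M(M)) u_divider_step0 (
    .clk         (clk),
    .rstn        (rstn),
    .en          (data_rdy),
    .dividend    ({{(M){1'b0}}, dividend[N-1]}),
    .divisor     (divisor),
    .merchant_ci ({(N_ACT-M+1){1'b0}}),        // quotient starts at 0
    .dividend_ci (dividend[N_ACT-M-1:0]),      // remaining dividend bits
    .dividend_kp (dividend_t[N_ACT-M]),
    .divisor_kp  (divisor_t[N_ACT-M]),
    .rdy         (rdy_t[N_ACT-M]),
    .merchant    (merchant_t[N_ACT-M]),
    .remainder   (remainder_t[N_ACT-M])
  );

  // Remaining stages: previous remainder joined with the next dividend bit
  genvar i;
  generate
    for (i = 1; i <= N_ACT-M; i = i + 1) begin : sqrt_stepx
      divider_cell #(.N(N_ACT), .M(M)) u_divider_step (
        .clk         (clk),
        .rstn        (rstn),
        .en          (rdy_t[N_ACT-M-i+1]),
        .dividend    ({remainder_t[N_ACT-M-i+1],
                       dividend_t[N_ACT-M-i+1][N_ACT-M-i]}),
        .divisor     (divisor_t[N_ACT-M-i+1]),
        .merchant_ci (merchant_t[N_ACT-M-i+1]),
        .dividend_ci (dividend_t[N_ACT-M-i+1]),
        .dividend_kp (dividend_t[N_ACT-M-i]),
        .divisor_kp  (divisor_t[N_ACT-M-i]),
        .rdy         (rdy_t[N_ACT-M-i]),
        .merchant    (merchant_t[N_ACT-M-i]),
        .remainder   (remainder_t[N_ACT-M-i])
      );
    end
  endgenerate

  // The last stage holds the final result
  assign res_rdy   = rdy_t[0];
  assign merchant  = merchant_t[0];
  assign remainder = remainder_t[0];

endmodule
```

With N = 5 and M = 3 this yields five chained stages, so a result appears five clock cycles after its operands are accepted, in line with the latency stated earlier.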

Testbench

The testbench uses a 5-bit dividend and a 3-bit divisor and verifies the results itself:

Example

`timescale 1ns/1ns

module test;
parameter N = 5;
parameter M = 3;
reg clk;
reg rstn;
reg data_rdy;
reg [N-1:0] dividend;
reg [M-1:0] divisor;

wire res_rdy;
wire [N-1:0] merchant;
wire [M-1:0] remainder;

// Clock generation
always begin
    clk = 0; #5;
    clk = 1; #5;
end

// Driver
initial begin
    rstn = 1'b0;
    data_rdy = 1'b0;  // keep the input-valid flag defined until stimulus starts
    #8;
    rstn = 1'b1;

    #55;
    @(negedge clk);
    data_rdy = 1'b1;
    dividend = 25; divisor = 5;
    #10; dividend = 16; divisor = 3;
    #10; dividend = 10; divisor = 4;
    #10; dividend = 15; divisor = 1;
    repeat(32) #10 dividend = dividend + 1;
    divisor = 7;
    repeat(32) #10 dividend = dividend + 1;
    divisor = 5;
    repeat(32) #10 dividend = dividend + 1;
    divisor = 4;
    repeat(32) #10 dividend = dividend + 1;
    divisor = 6;
    repeat(32) #10 dividend = dividend + 1;
end

// Delay input for self-verification
reg [N-1:0] dividend_ref[N-1:0];
reg [M-1:0] divisor_ref[N-1:0];
always @(posedge clk) begin
    dividend_ref[0] <= dividend;
    divisor_ref[0] <= divisor;
end

genvar i;
generate
    for(i=1; i<=N-1; i=i+1) begin : ref_delay  // named block, required for generate loops in Verilog-2001
        always @(posedge clk) begin
            dividend_ref[i] <= dividend_ref[i-1];
            divisor_ref[i] <= divisor_ref[i-1];
        end
    end
endgenerate

// Self-verification
reg error_flag;
always @(posedge clk) begin
    #1;
    if (merchant * divisor_ref[N-1] + remainder != dividend_ref[N-1] && res_rdy) begin
        // In the testbench, multiplication can be directly used without considering the operation cycle
        error_flag <= 1'b1;
    end
    else begin
        error_flag <= 1'b0;
    end
end

// Module instantiation
divider_man #(.N(N), .M(M))
u_divider(
    .clk(clk),
    .rstn(rstn),
    .data_rdy(data_rdy),
    .dividend(dividend),
    .divisor(divisor),
    .res_rdy(res_rdy),
    .merchant(merchant),
    .remainder(remainder)
);

// Simulation finish
initial begin
    forever begin
        #100;
        if ($time >= 10000) $finish;
    end
end

endmodule // test

Simulation Results

As the simulation waveform shows, each pair of input operands produces the correct quotient and remainder after a latency equal to the bit width of the dividend (5 clock cycles here). Because the design is pipelined, a new result is output on every clock cycle once the pipeline is full, which meets the design requirements.
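As an independent cross-check of the shift, compare, and subtract procedure, a purely behavioural model can be simulated on its own, without the pipelined modules. The sketch below is not part of the original source; the module name `div_ref_model`, the function `div_ref`, and its local variables are hypothetical. It compares the bit-serial result against the simulator's built-in `/` and `%` operators for every dividend and every non-zero divisor.

```verilog
`timescale 1ns/1ns

// Behavioural reference model (hypothetical helper, simulation only)
module div_ref_model;
  parameter N = 5;   // dividend width
  parameter M = 3;   // divisor width

  // Bit-serial restoring division; returns {quotient, remainder}
  function [N+M-1:0] div_ref;
    input [N-1:0] dividend;
    input [M-1:0] divisor;
    reg   [M:0]   part;   // partial dividend, one bit wider than the divisor
    reg   [N-1:0] quot;
    integer       k;
    begin
      part = 0;
      quot = 0;
      for (k = N-1; k >= 0; k = k - 1) begin
        part = {part[M-1:0], dividend[k]};   // bring down the next bit
        if (part >= {1'b0, divisor}) begin
          part = part - {1'b0, divisor};     // quotient bit is 1
          quot = (quot << 1) | 1'b1;
        end
        else begin
          quot = quot << 1;                  // quotient bit is 0
        end
      end
      div_ref = {quot, part[M-1:0]};
    end
  endfunction

  integer     a, b, errors;
  reg [N-1:0] q;
  reg [M-1:0] r;

  initial begin
    errors = 0;
    // Exhaustive check; divisor 0 is skipped because a/0 is undefined
    for (a = 0; a < (1 << N); a = a + 1) begin
      for (b = 1; b < (1 << M); b = b + 1) begin
        {q, r} = div_ref(a, b);
        if (q !== a / b || r !== a % b) begin
          errors = errors + 1;
          $display("MISMATCH: %0d / %0d gave q=%0d r=%0d", a, b, q, r);
        end
      end
    end
    {q, r} = div_ref(27, 5);
    $display("27 / 5: quotient %0d, remainder %0d, mismatches %0d", q, r, errors);
    $finish;
  end
endmodule
```

Each loop iteration in `div_ref` corresponds to one `divider_cell` stage, so the model can also serve as a reference when changing N or M.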
