I want to create a very compact parallel-to-serial shift register.
I have manually designed a logic tile.
I want yosys/nextpnr to just do the routing between this tile and the I/O pins.
I have written the code using yosys primitives, but nextpnr fails to fuse the LUTs with the carries.
Here is the code:
module top (
    output PIN_21, PIN_22, PIN_23, PIN_24, USBPU,
    input CLK, PIN_1, PIN_2, PIN_3, PIN_4, PIN_5, PIN_6, PIN_7, PIN_8, PIN_9, PIN_10, PIN_11, PIN_12, PIN_13
);
    wire[12:0] loop;   // registered outputs of the stages
    wire[12:0] carry;  // carry chain between the stages

    // Positional ports of MyCell: (O, CO, I0, I1, I2, I3, CI, CLK)
    MyCell #(.LUT_INIT('h0F0F)) sRegBorder(loop[0], carry[0], 0, loop[0], PIN_13, 0, 0, CLK);
    MyCell #(.LUT_INIT('hFAFA)) sRegA(loop[1], carry[1], loop[0], loop[1], PIN_13, PIN_1, carry[0], CLK);
    MyCell #(.LUT_INIT('hFAFA)) sRegB(loop[2], carry[2], loop[1], loop[2], PIN_13, PIN_2, carry[1], CLK);
    MyCell #(.LUT_INIT('hFAFA)) sRegC(loop[3], carry[3], loop[2], loop[3], PIN_13, PIN_3, carry[2], CLK);
    MyCell #(.LUT_INIT('hFAFA)) sRegD(loop[4], carry[4], loop[3], loop[4], PIN_13, PIN_4, carry[3], CLK);
    MyCell #(.LUT_INIT('hFAFA)) sRegE(PIN_24, carry[5], loop[4], PIN_24, PIN_13, PIN_5, carry[4], CLK);

    // Last LUT: the final carry-out is wired to its I3 input, its output drives PIN_22
    SB_LUT4 #(.LUT_INIT('hFFFF)) sRegFin (PIN_22,0,0,0,carry[5]);
endmodule

// One logic cell: LUT4, carry and flip-flop sharing the same inputs
module MyCell(output O, CO, input I0, I1, I2, I3, CI, CLK);
    parameter [15:0] LUT_INIT = 0;
    wire lo;  // combinational LUT output, registered by the DFF
    SB_LUT4 #(.LUT_INIT(LUT_INIT)) lut (lo, I0, I1, I2, I3);  // (O, I0, I1, I2, I3)
    SB_CARRY cr (CO, I1, I2, CI);                             // (CO, I0, I1, CI)
    SB_DFF dff (O, CLK, lo);                                  // (Q, C, D)
endmodule
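As a reading aid for the LUT_INIT constants used above, here is a minimal simulation-only sketch that decodes them. It assumes the usual SB_LUT4 convention that the init bit is selected by the index {I3, I2, I1, I0}; the module name lut_init_check and its signal names are hypothetical and not part of the design. Under that assumption 'h0F0F evaluates to ~I2 and 'hFAFA to I0 | I2, so neither constant depends on I1 or I3.

```verilog
// Simulation-only check of the two LUT_INIT constants; not meant for synthesis.
// Assumption: SB_LUT4 output = LUT_INIT[{I3, I2, I1, I0}].
module lut_init_check;
    reg [15:0] init_border;  // constant used by sRegBorder
    reg [15:0] init_stage;   // constant used by sRegA..sRegE
    reg i0, i1, i2, i3;
    integer idx, errors;

    initial begin
        init_border = 16'h0F0F;
        init_stage  = 16'hFAFA;
        errors = 0;
        for (idx = 0; idx < 16; idx = idx + 1) begin
            {i3, i2, i1, i0} = idx[3:0];
            // 'h0F0F should behave as NOT I2
            if (init_border[idx] !== !i2)
                errors = errors + 1;
            // 'hFAFA should behave as I0 OR I2
            if (init_stage[idx] !== (i0 | i2))
                errors = errors + 1;
        end
        $display("mismatches: %0d", errors);
        $finish;
    end
endmodule
```

Running this in any Verilog simulator only confirms the truth tables; it says nothing about how the cells are packed.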
The expected result is to have just one tile with a stack of 7 LUTs.
* PIN_13 should be connected to I2 of the first 6 LUTs.
* PIN_[1-6] should be connected to I3 of the first 6 LUTs, respectively.
* every output of the first 6 LUTs should be buffered (DFF) and the buffered output should loop back to I1 of the same LUT.
* every output of the first 5 LUTs should also be routed to I0 of the next LUT in sequence.
* the carry logic should be enabled and flow through the first 6 LUTs, and at LUT7 it should be captured as an output.
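The per-cell behaviour described in this list can also be written as a behavioral reference model, for example to compare against the primitive version in simulation. This is only a sketch under assumptions: MyCellRef is a hypothetical name, the LUT function is taken to be I0 | I2 (what 'hFAFA evaluates to if the init bits are indexed by {I3, I2, I1, I0}), and the carry is modelled as the majority function of the two operand inputs and CI. It does not constrain how nextpnr packs anything.

```verilog
// Behavioral reference for one 'hFAFA stage (hypothetical helper, simulation aid only).
// Port list mirrors MyCell: (O, CO, I0, I1, I2, I3, CI, CLK).
module MyCellRef(output reg O, output CO, input I0, I1, I2, I3, CI, CLK);
    // LUT followed by the flip-flop: with 'hFAFA the LUT output is I0 | I2.
    // I1 and I3 do not influence the LUT output for this init value.
    always @(posedge CLK)
        O <= I0 | I2;

    // Carry: MyCell feeds I1 and I2 to the carry operands,
    // so CO is the majority of (I1, I2, CI).
    assign CO = (I1 & I2) | ((I1 | I2) & CI);
endmodule
```

Instantiating MyCellRef next to MyCell with the same stimulus and comparing O and CO is one way to confirm that the primitive wiring does what the list describes.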
The result I got from yosys looks OK, but nextpnr scatters the LUTs all over the place and allocates separate LUTs for the carries, doubling the number of LUTs used.
So basically, if I know the output that I want, at least down to a specific tile configuration, what should I write as input?
I am trying to compile the code for a TinyFPGA BX.
I believe the answer is that what you are trying to do cannot currently be done. I am sorry for this negative answer; I need tight packing too, so I hope someone can prove me wrong.
I did some investigations, downloaded the newest yosys and nextpnr (and arachne-pnr), and more or less copied your design. While iCEcube2 gave the result I expected, neither yosys/arachne-pnr nor yosys/nextpnr-ice40 managed. This is a very vexing problem, which seems to reside in nextpnr/ice40/pack.cc around line 192. I am sorry I can't help, but perhaps this input can be used to improve packing in nextpnr-ice40.