Abstract: This invention provides a method and an improved FPGA apparatus for enabling the selective deployment of unused flip-flops or other circuit elements in IO cells and unused decoders or other circuit elements in Look Up Tables (LUT), for core logic functions, comprising disconnecting means for selectively disconnecting unused circuit elements from the IO pad circuitry or from said LUT circuitry, and connecting means for selectively connecting said disconnected circuit elements either to the connection matrix of the core logic or between themselves to provide independently configured functions.
UTILIZATION OF UNUSED IO BLOCK FOR CORE LOGIC FUNCTIONS
Field of the invention
This invention relates to a system and method for enabling the utilization of unused IO Block and Look Up Table (LUT) circuitry for core logic functions or independent logic functions.
Background of Invention
In many FPGA applications it is required to provide the option of registering the incoming and outgoing signals to and from the FPGA. For this purpose IO cells are usually designed to include flip-flops. A flip-flop is provided with the output buffer to register the signal coming from the core before it goes to the IO pad, and with the input buffer to register the signal coming from the pad before it goes to the core. Sometimes the tri-stating signal of the output buffer is also provided with a flip-flop for synchronization. These IO Blocks (IOBs) include the option to use these flip-flops or to bypass them, depending upon the type of application (ref. Xilinx's data book of 1999, Virtex device IOB, page 3-6).
In applications where registered inputs and outputs are not required, there is direct signaling between the IO pads and the core, and the flip-flops are left unutilized. It is also possible that some of the IOs of the FPGA device are not used; in this case too, the flip-flops associated with these IOs are not utilized. With a minimal addition of hardware these flip-flops can be utilized for some other purpose, thereby reducing the load on the internal core logic cells.
US patent 5,869,982 describes an apparatus and method for interconnecting adjacent unused IO pad circuitry to provide independent logic functions. That patent does not, however, provide for the connection of such unused circuitry to the core logic, nor does it utilize the unused circuit elements of the Look Up Tables.
The object and summary of the invention
The object of this invention is to provide an apparatus and method for enabling the utilization of unused IO pad and LUT circuitry for core logic functions or for implementing independent logic functions.
To achieve the said objective this invention provides, in an FPGA apparatus, an improvement for enabling the selective utilization of unused flip-flops or other circuit elements in IO cells and unused decoders or other circuit elements in Look Up Tables (LUT), for core logic functions, comprising:
disconnecting means for selectively disconnecting unused circuit elements from the IO pad circuitry or from said LUT circuitry, and connecting means for selectively connecting said disconnected circuit elements either to the connection matrix of the core logic or between themselves to provide independently configured functions.
The said disconnecting means is Configuration Logic circuitry provided between the internal core logic and IO pad interface circuits or LUTs.
The said connecting means is a routing matrix between internal core logic and said IO pad circuitry or LUT circuitry.
The said unused IO pad flip-flops are configured as serial-to-parallel or parallel-to-serial data converters.
The said unused LUT circuit elements are deployed to implement configurable two or four input logic functions.
The said logic function is a multiplexer function.
The above FPGA apparatus includes grouping of said IO pads for enabling configurable complex logic functions.
The present invention further provides a method for enabling the utilization of unused flip-flops or other unused circuit elements in IO cells and unused decoders or other circuit elements in Look Up Tables (LUTs) of an FPGA for core logic functions, comprising the steps of:
disconnecting said unused circuit elements from said IO circuitry and/or LUT, and
connecting said disconnected circuit elements to the connection matrix of the core logic or amongst themselves to provide independent functions.
The said disconnecting is done by Output Configuration Logic circuitry provided between the core logic and IO pad interface (IOL) circuits or LUT.
The connecting is done by a routing matrix between internal core logic and said IO pad circuitry or LUT circuitry.
The said method is used for configuring said unused IO pad flip-flops as parallel-to-serial or serial-to-parallel data converters.
The said unused LUT circuit elements are deployed to implement configurable two or four input logic functions.
The said logic function is a multiplexer function.
The above method includes grouping of said IO pads for enabling configurable complex logic functions.
Brief description of the drawings
The invention will now be described with reference to the accompanying drawings.
Fig. 1 shows the top level structure of an FPGA according to this invention.
Fig. 2 shows the block diagram of an IO Group IOG.
Fig. 3a) shows the internal structure of an IOG.
Fig. 3b) shows the details of the flip-flops in each IOG.
Fig. 4 shows the internal structure of an IO Logic Block (IOL).
Fig. 5 shows the structure of the switch boxes inside the IOL.
Fig. 6 shows the interconnection of 4 IOLs.
Fig. 7 shows another embodiment of the invention relating to unused LUT decoders.
Detailed description of the invention
This invention describes an improved FPGA in which each IO pad has associated with it an IOI (input output interface) and an IOL (input output logic). IOIs comprise input and output buffers for interfacing with the external world, and IOLs have flip-flops and muxes for providing registered, latched, unregistered and other logical options to IO signals. Each IOL has four flip-flops: one for input data, one for output data and two for output buffer tri-state signals. Hence the IOL forms the link between the IOI and the core. The IOLs of four consecutive pads are grouped to form an IOLG (input output logic group) and the corresponding four IOIs are grouped to form an IOIG (input output interface group). Thus, as each IOL has 4 flip-flops, each IOLG has 4*4 = 16 flip-flops. An IOLG and its corresponding IOIG are grouped to form an IOG (input output group). So each IOG groups four IO pads, four IOIs and four IOLs. A 4-input LUT decoder circuit is also associated with each IOLG.
Apart from the normal use of flip-flops to register input/output signals in an IOG, flip-flops not used for this purpose can be used for a 4-input LUT or for a Serial-to-Parallel or Parallel-to-Serial data converter. When all four IOs in an IOG are used in direct mode or are unused, the 16 unused flip-flops in an IOLG can be configured as transparent latches and, along with an LUT decoder, can be used as a 4-input LUT for logic implementation. This reduces the load on the internal core logic cells. This four-input LUT can take its inputs either from the routing matrix or directly from the four input buffers of the IOG with which it is associated. The output of the LUT can likewise be configured to go to the routing matrix or directly to any one of the output buffers of the same IOG. This option of connecting the inputs and output of the LUT directly to the IO pads avoids the delay that would otherwise be incurred by routing the signals through the routing matrix to the internal logic cells for the same purpose.
When four or fewer of the IOs of an IOG are used in direct mode or are unused, the unused flip-flops of the IOLG can be used as a Serial-to-Parallel or Parallel-to-Serial data converter. If the flip-flops of only one IOL in an IOLG are free, then a 4-bit Serial-to-Parallel or Parallel-to-Serial data converter can be implemented using these four flip-flops. Similarly, if two IOLs of an IOLG have their flip-flops free, then an 8-bit converter can be implemented; if three IOLs are free, a 12-bit converter; and if all four IOLs are free, a 16-bit converter. Data converters wider than 16 bits can be implemented using IOLs of adjacent IOGs. Serial-to-Parallel or Parallel-to-Serial data converters can be core-to-core (i.e. serial input coming from the core and parallel output going back to the core, or parallel input coming from the core and serial output going back to the core), pad-to-core (input from the pad and output going to the core) or core-to-pad (input from the core and output going to the pad). In the pad-to-core case only serial-to-parallel data converters (i.e. serial input from the pad and parallel output to the core) are possible, and similarly in the core-to-pad case only parallel-to-serial data converters (i.e. parallel input from the core and serial output to the pad) are possible. Other conversion operations (pad-to-core parallel-to-serial conversion, core-to-pad serial-to-parallel conversion and pad-to-pad conversions) cannot be done independently in an IOLG, as the data has to be routed via the routing matrix to complete the conversion. These operations depend on the architecture of the routing matrix.
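The width rule and the list of conversions that stay inside one IOLG can be summarized in a short Python sketch. The function names and the "s2p"/"p2s" labels are hypothetical and chosen only for illustration; the numbers (four flip-flops per IOL, four IOLs per IOLG) and the allowed combinations are taken from the paragraph above.

```python
FLIP_FLOPS_PER_IOL = 4  # stated in the text: each IOL has four flip-flops
IOLS_PER_IOLG = 4       # stated in the text: four IOLs per IOLG


def converter_width(free_iols: int) -> int:
    """Widest converter (in bits) that fits in the free IOLs of one IOLG."""
    if not 0 <= free_iols <= IOLS_PER_IOLG:
        raise ValueError("an IOLG has between 0 and 4 free IOLs")
    return free_iols * FLIP_FLOPS_PER_IOL


# Conversions that can be completed inside an IOLG, keyed by
# (source, destination). Hypothetical labels:
# "s2p" = serial-to-parallel, "p2s" = parallel-to-serial.
LOCAL_CONVERSIONS = {
    ("core", "core"): {"s2p", "p2s"},
    ("pad", "core"): {"s2p"},
    ("core", "pad"): {"p2s"},
}


def is_local(source: str, destination: str, kind: str) -> bool:
    """True if the conversion needs no detour through the routing matrix."""
    return kind in LOCAL_CONVERSIONS.get((source, destination), set())
```

Any combination for which `is_local` returns False (for example pad-to-pad) is one of the cases that the text says must go through the routing matrix.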
Detailed Description of the preferred embodiment of the Invention
Figure 1 gives the top level of hierarchy for the proposed architecture. IOs in the IO ring of an FPGA are grouped into IOGs (Input Output Groups). Each IOG groups four IOs. GENERAL ROUTING is a configurable routing matrix that provides a flexible interface between IOG and CORE and between IOG and IOG.
Figure 2 shows the block diagram of an IOG and its interface with the routing matrix GENERAL ROUTING and adjacent IOGs. In the figure three IOGs, IOG1, IOG2 and IOG3, are shown. As all IOGs are exactly the same, IOG1 is taken for explanation. Four IO pads P1, P2, P3, P4, an IOIG (IO Interface Group), an IOLG (IO Logic Group) and a LUT DECODER together form IOG1. Routing resources are provided to interface between the different blocks of an IOG. IO pads P1, P2, P3, P4 are directly connected to the IOIG. Routes R2 and R4 are used to interface between the IOIG and the IOLG. Interfacing between the IOLG and GENERAL ROUTING is done through routes R1 and R3. L-OUT is the routing line which takes the output of the LUT to GENERAL ROUTING, and a tapping from L-OUT also goes to the IOIG to provide the LUT output directly at any one of the IO pads P1, P2, P3, P4. R6 is a 4-bit bus coming from GENERAL ROUTING and R7 is a 4-bit bus coming from the IOIG. R6 and R7 go to the 4-bit bus multiplexer BM, whose outputs act as select lines for the LUT decoder (i.e. the 4 input lines to the LUT). So the 4 inputs to the LUT can come from GENERAL ROUTING or directly from IO pads P1, P2, P3, P4. Route R8 provides the interface between two adjacent IOLGs.
Figure 3(a) shows a single IOG. As every IOG is exactly the same, the explanation continues with IOG1. Referring to figure 2, IOG1 comprises 4 IO pads P1, P2, P3, P4, an IOIG (IO Interface Group), an IOLG (IO Logic Group) and a 4-bit LUT-DECODER circuit. The IOIG is a group of 4 IOIs, IOI1, IOI2, IOI3 and IOI4, each connected to its respective pad, namely P1, P2, P3, P4.
Each IOI is also linked with an IOL. So there are four IOLs, IOL1, IOL2, IOL3 and IOL4, one for each respective IOI. These four IOLs are grouped to form the IOLG. Each IOI has an input buffer BUFIN and an output buffer BUFOUT. Input buffer BUFIN receives the signal from the IO pad and gives its output to the IOL via line LI. Output buffer BUFOUT gets its input from the IOL via line LO and its output goes to the IO pad. BUFOUT can be configured as open-drain type, open-source type or push-pull type, or can be permanently tri-stated, using signals LTp and LTn. Signal LTp tri-states the pull-up transistors and LTn tri-states the pull-down transistors of the output buffer BUFOUT.
Each IOL comprises four flip-flops and multiplexers (not shown in this figure). The flip-flops are used to provide the register options for the IO signals. These flip-flops are named FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P, one for each of the signals linked with the corresponding IOI. For example, flip-flop FF-I/P is for input signal LI, flip-flop FF-T/Sp for pull-up tri-stating signal LTp, flip-flop FF-T/Sn for pull-down tri-stating signal LTn and flip-flop FF-O/P for output signal LO.
As an IOLG has 4 IOLs and each IOL has 4 flip-flops, there are 4*4 = 16 flip-flops in each IOLG. All the 16 flip-flops in an IOLG have two clock inputs, one from line CLK1 and the other from line CLK2. Line CLK1 gets the clock from pin CLK through a NAND gate G1, and line CLK2 also gets the clock from pin CLK but through a NOR gate G2. The other input of NAND gate G1 is connected to configuration bit CB1 and the other input of NOR gate G2 is connected to CB1~ (the inverse of CB1). For CB1 equal to zero, all flip-flops behave as transparent latches, and for CB1 equal to one they work as flip-flops, getting their clock from pin CLK. The need for and connectivity of the two clock lines CLK1 and CLK2 within a flip-flop are explained with reference to Figure 3(b).
An IOLG also includes a LUT-DECODER. This LUT-DECODER is simply a 16-to-1 multiplexer and, along with the 16 flip-flops of the IOLG, forms a 4-input LUT. When used in an LUT, all the 16 flip-flops are loaded with the data required for the logic implementation and their outputs go to the LUT-DECODER (not shown in this figure). The four inputs to the LUT can come either from GENERAL ROUTING via route R6 or directly from the four IO pads P1, P2, P3, P4 of the parent IOG via route R7. Route R6 or route R7 is selected as the 4 inputs to the LUT using bus multiplexer BM. Similarly, the output of the LUT can go to GENERAL ROUTING via route L-OUT or to any one of the IO pads P1, P2, P3, P4 of the parent IOG (shown in detail in figure 4).
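The way 16 stored bits and a 16-to-1 multiplexer form a 4-input LUT can be illustrated with a minimal behavioural sketch in Python. The function names and the bit ordering of the select inputs (most significant bit first) are assumptions made for the example; the text does not specify an ordering.

```python
def program_lut(fn):
    """Return the 16 bits to load into the IOLG flip-flops for a 4-input function.

    Assumption: entry i holds fn(a, b, c, d) where (a, b, c, d) are the
    bits of i, most significant first.
    """
    return [fn((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) for i in range(16)]


def lut_decoder(stored_bits, select):
    """Behave as a 16-to-1 multiplexer: pick one stored bit using 4 select bits."""
    if len(stored_bits) != 16 or len(select) != 4:
        raise ValueError("expected 16 stored bits and 4 select bits")
    index = 0
    for bit in select:
        index = (index << 1) | bit
    return stored_bits[index]


# Example: a 4-input AND implemented in the IO ring instead of a core logic cell.
and4 = program_lut(lambda a, b, c, d: a & b & c & d)
print(lut_decoder(and4, (1, 1, 1, 1)))  # prints 1
print(lut_decoder(and4, (1, 0, 1, 1)))  # prints 0
```

In hardware terms, `select` stands for the output of bus multiplexer BM (route R6 or R7) and the return value for line L-OUT.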
Figure 3(b) shows the schematic of the flip-flops used. The schematic is the same as that of a conventional flip-flop; the only difference is the two clock input pins CLK1 and CLK2. CLK1 is buffered through two inverters and connected to the gates of pass transistors N1 and N4. CLK2 is first inverted to get CLK2- and then CLK2- is connected to the gates of pass transistors N2 and N3. Pin IN is the input of the flip-flop and pins OUT1 and OUT2 are its two outputs. OUT2 is the normal output of the flip-flop (registered output) and OUT1 is a tapping that provides the latched output.
It can be seen that when CB1 (figure 3(a)) is 'zero', both nets CLK1 and CLK2- are at 'one'. This turns all the pass transistors N1, N2, N3, N4 ON, making the flip-flop a simple latch. When CB1 is 'one', net CLK1 is driven by the clock from pin CLK and net CLK2- is driven by the inverse of the clock from pin CLK, so that the flip-flop operates normally.
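The gating of the two clock nets can be checked with a small gate-level sketch in Python. It models only what the text states about gates G1 and G2 (a NAND of CLK with CB1, and a NOR of CLK with the inverse of CB1, followed by the inverter that produces CLK2-); the helper names are hypothetical. The sketch verifies that both nets sit at 'one' when CB1 is zero and that they are complementary when CB1 is one. Which of the two nets carries the true and which the inverted clock follows from the gate polarities and is not asserted here.

```python
def nand(a, b):
    return 1 - (a & b)


def nor(a, b):
    return 1 - (a | b)


def inv(a):
    return 1 - a


def clock_nets(clk, cb1):
    """Return (CLK1, CLK2-) for a given clock level and configuration bit CB1."""
    clk1 = nand(clk, cb1)        # gate G1: NAND of CLK and CB1
    clk2 = nor(clk, inv(cb1))    # gate G2: NOR of CLK and CB1~
    clk2_bar = inv(clk2)         # inverter inside the flip-flop gives CLK2-
    return clk1, clk2_bar


for clk in (0, 1):
    # Latch mode: both nets high regardless of the clock.
    assert clock_nets(clk, 0) == (1, 1)
    # Flip-flop mode: the two nets are always complementary.
    clk1, clk2_bar = clock_nets(clk, 1)
    assert clk1 != clk2_bar
```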
It should be noted that figure shows schematic of a simple flip-flop and it can be modified accordingly to add set, reset or other features.
Figure 4 shows the detailed structure of a single IOL and its interface with the IOI, GENERAL ROUTING and adjacent IOLs. In the figure IOL2 is used for explanation.
Lines LI, LTp, LTn and LO interface IOL2 with IOI2. Lines L0 to L7 and L-OUT interface IOL2 to GENERAL ROUTING. (Note that IOL1 and IOL3 also have lines and devices with names common to IOL2; this is because all the IOLs have exactly the same structure.) Line L-OUT is the output of the LUT-DECODER, which goes to GENERAL ROUTING. A tapping from line L-OUT also goes to mux M13. This allows the LUT's output to be configured to go to the IO pad. Each of the four flip-flops FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P in IOL2 has one input and two outputs, one of which gives the latched output and the other the flip-flopped output. The input to flip-flop FF-I/P can be configured through mux M0 to come from any of the lines N0, LI or write. Similarly, the input to flip-flop FF-T/Sp can be configured through mux M1 to come from any of the lines N1, L5 or write, the input to flip-flop FF-T/Sn through mux M2 to come from any of the lines N2, L6 or write, and the input to flip-flop FF-O/P through mux M3 to come from any of the lines N3, L7 or write. Both outputs of the flip-flops FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P go to the muxes M10, M11, M12 and M13 respectively. Mux M10 selects one of the outputs of flip-flop FF-I/P or line LI (LI is the signal line from the pad through the input buffer) to connect to line L4. Mux M11 selects one of the outputs of flip-flop FF-T/Sp, configuration bit CB2 or line L5 to connect to line LTp (LTp is the pull-up tri-state signal). Similarly, mux M12 selects one of the outputs of flip-flop FF-T/Sn, configuration bit CB3 or line L6 to connect to line LTn (LTn is the pull-down tri-state signal), and mux M13 selects one of the outputs of flip-flop FF-O/P, line L-OUT or line L7 to connect to line LO (LO is the signal line going to the pad).
A tapping from the flip-flopped output of the flip-flops FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P goes to the configurable switch boxes S1, S2, S3 and S0 respectively. Note that switch box S0 of IOL2 is connected to the output of flip-flop FF-O/P of IOL1, and the output of flip-flop FF-O/P of IOL2 goes to switch box S0 of IOL3. Switch box S1 can be configured to connect line N1 to line L1 and/or line N1 to the output of flip-flop FF-I/P. The other switch boxes S0, S2, S3 can be configured similarly.
A tapping L-IN from the flip-flopped output of all the flip-flops also goes to the LUT-DECODER.
The structure of all IOLs is exactly the same as that of IOL2.
Figure 5 shows the structure of switch boxes S0-S3. Each switch box has two NMOS transistors used as switches, SWITCH A and SWITCH B. SWITCH A defines the connectivity of line L to line N, and SWITCH B defines the connectivity of line FF (flip-flop output) to line N.
SWITCH A is controlled through a NAND gate G3. It can be configured by configuration bit CB4 to be permanently ON or to be controlled by a dynamic control signal DYN. Signal DYN can be generated within the core. Similarly, SWITCH B is controlled through cascaded OR-AND gates G4, G5 and G6. It can be configured by the same configuration bit CB4 and configuration bit CB5 to be permanently ON, permanently OFF or controlled by the dynamic control signal DYN.
The table below is the truth table of the states of both switches with respect to the status of configuration bits CB4 and CB5.
(Table Removed)
Figure 6 shows the connectivity of four IOLs, namely IOL1, IOL2, IOL3 and IOL4, in an IOLG. For each IOL only the circuitry needed to explain its interfacing with the neighboring IOLs is shown. Route R8 in IOL1 connects it to IOL4 of the previous IOLG and route R8 in IOL4 connects it to IOL1 of the next IOLG. Route R8 in IOL1 is linked with switch box S0 of IOL1. This switch box can be configured so as to take the signal R8 to FF-I/P through mux M0. The output of flip-flop FF-O/P of IOL1 goes to one of the inputs of S0 of IOL2. In a similar way all the IOLs in an IOLG can be interfaced with their neighboring IOLs. Also, the first IOL of an IOLG can be connected to the last IOL of the previous IOLG, and the last IOL of an IOLG can be connected to the first IOL of the next IOLG.
Figure 7 shows another embodiment of the present invention. The only additions in this embodiment are 16 two-input LM muxes. The LUT-DECODER now takes its inputs from these 16 muxes instead of directly from the flip-flop outputs. One of the inputs to each LM mux comes from the flip-flopped output of a flip-flop and the other comes from the lines L0-L3. So the LM muxes provide the option of selecting whether the 16 inputs to the LUT-DECODER come directly from the core or from the flip-flops. This option allows the LUT-DECODER to be used as a 16-input multiplexer when it is not used for a LUT.
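The two uses of the LUT-DECODER in this embodiment can be sketched behaviourally in Python. The names are hypothetical, and the sketch assumes a single shared control for all 16 LM muxes and a most-significant-bit-first select ordering; neither detail is given in the text.

```python
def lm_mux(ff_out, core_line, use_core):
    """Two-input LM mux: pass the core line or the flip-flop output."""
    return core_line if use_core else ff_out


def lut_decoder(inputs, select):
    """16-to-1 multiplexer addressed by 4 select bits (MSB first, an assumption)."""
    index = 0
    for bit in select:
        index = (index << 1) | bit
    return inputs[index]


def iolg_output(ff_outputs, core_lines, select, use_core):
    """L-OUT for one IOLG: LUT mode (use_core=False) or 16-input mux mode."""
    decoder_inputs = [
        lm_mux(ff, core, use_core) for ff, core in zip(ff_outputs, core_lines)
    ]
    return lut_decoder(decoder_inputs, select)
```

With `use_core=False` the block behaves as the 4-input LUT of the first embodiment; with `use_core=True` the same decoder steers one of 16 live core signals to L-OUT.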
DESCRIPTION OF OPERATING MODES
The operation of the above-preferred embodiment of the invention will now be described for various modes of operation.
Following text describes the configuration of IOG for various modes of operation.
A. NORMAL OPERATION MODE
IOLs can be configured, independently of each other, for normal operation. In normal operation mode, an IOL can be configured to provide direct, registered or latched input data from the input buffer to the core, and also to provide direct, registered or latched output data and tri-state signals from the core to the output buffer.
Referring to figure 4, to provide direct input to the core, line LI coming from the input buffer BUFIN is selected by mux M10 to connect to line L4 (BUFIN and BUFOUT are shown in figure 3a). To provide registered or latched input to the core, line LI is selected by mux M0 as input to the flip-flop FF-I/P. One of the two outputs of this flip-flop can be selected by mux M10 to connect to line L4, depending upon the type of input required, registered or latched.
Similarly, to provide data coming from the core directly to the output buffer BUFOUT, line L7 coming from the core is selected by mux M13 to connect to line LO. To register or latch the core data, line L7 is selected by mux M3 as input to flip-flop FF-O/P. One of the two outputs of this flip-flop can be selected by mux M13 to connect to line LO, depending upon the type of output required, registered or latched. Tri-state signals to the output buffer BUFOUT can also be configured as direct, registered, latched or permanent (permanent is when coming from a configuration bit). The configuration bits CB2 and CB3 can be connected to tri-state signals LTp and LTn through muxes M11 and M12 respectively. Through this option the output buffer can be kept permanently enabled or tri-stated depending upon the configuration bits CB2 and CB3. For open-drain output, only the pull-up transistor is permanently tri-stated, through line LTp and configuration bit CB2. Alternatively, only the pull-down transistor is permanently tri-stated, through line LTn and configuration bit CB3. Both tri-state lines LTp and LTn can also be configured independently to have direct, registered or latched signals. To provide a signal coming from the core directly to the pull-up tri-state line LTp, line L5 is selected by mux M11 to connect to line LTp. To register or latch the signal, line L5 is selected by mux M1 as input to flip-flop FF-T/Sp. One of the two outputs of this flip-flop can be selected by mux M11 to connect to line LTp, depending upon the type of pull-up tri-state signal required, registered or latched. Similarly, the pull-down tri-state line LTn can be configured for these options by muxes M12 and M2, line L6 and flip-flop FF-T/Sn.
In summary, during Normal Operation, the core can get direct, registered or latched input data from the pad. A pad can have direct, registered or latched output data from the core. An output buffer can be configured to be permanently enabled, permanently tri-stated, dynamically tri-state controlled by core, pull-up open drain or pull-down open drain. The output buffer tri-state signals from the core can also be direct, registered or latched.
When input, output and tri-state signals of an IOL use direct signaling, then the four unused flip-flops can be used in other modes of operation.
B. DATA CONVERSION MODE
In this mode the unused flip-flops of IOLs can be configured for parallel-to-serial or serial-to-parallel data conversion operations. A single IOL can be used as a 4-bit data converter. For higher widths two or more IOLs can be cascaded. There are various ways of data conversion depending upon the requirement, as described below.
B.1) Parallel To Serial Data Conversion
In this mode parallel data is converted into serial data using the flip-flops in the IOLs of IOGs. The data converter can be of any width.
The different options in this mode are described below using the example of a 4*1 bit parallel to serial data converter:
B.1.1) Core-to-Core: In this mode parallel data coming from the core is loaded into the flip-flops and then shifted serially to give serial output, which goes back to the core.
Referring to the IOL structure shown in figure 4, to operate in this mode SWITCH A and SWITCH B of switch boxes S1-S3 in a given IOL are configured to be dynamically controlled by signal DYN (the structure of the switch boxes is shown in figure 5). Switch box S0 of the same IOL has its SWITCH A permanently ON and SWITCH B permanently OFF, and switch box S0 of the next IOL (e.g. IOL3 is the next IOL to IOL2) has its SWITCH A permanently ON and SWITCH B also permanently ON.
At the beginning of this mode SWITCH A of switch box S0 is permanently ON and SWITCH A of switch boxes S1, S2 and S3 is kept in the ON state by signal DYN, thus connecting L0 to N0, L1 to N1, L2 to N2 and L3 to N3. Nets N0, N1, N2 and N3 are selected by multiplexers M0, M1, M2 and M3 respectively and fed to flip-flops FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P respectively, which load the 4-bit data from the core in parallel at the rising edge of the clock pulse (assuming positive-edge-triggered flip-flops). After the parallel data is loaded, SWITCH A of switch boxes S1, S2 and S3 goes into the OFF state and SWITCH B of switch boxes S1, S2 and S3 goes into the ON state. As SWITCH A and SWITCH B of switch box S0 of the next IOL are permanently ON, with every clock edge the parallel data shifts serially through the flip-flops, giving the serial output on L0 of the next IOL in four clock pulses.
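The load-then-shift sequence described above can be followed in a small cycle-level Python sketch. It is a behavioural model under stated assumptions, not the circuit itself: flip-flop positions 0 to 3 stand for FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P, the boolean `dyn_load` stands for the phase in which DYN holds SWITCH A of S1-S3 ON, and lines L0-L3 are assumed to be held constant during shifting. All names are hypothetical.

```python
def clock_edge(ff, l_lines, dyn_load):
    """Return the next state of the four flip-flops after one clock edge.

    N0 always follows L0 (S0: SWITCH A ON, SWITCH B OFF). For N1..N3,
    SWITCH A (load from L line) or SWITCH B (take the previous flip-flop's
    output) is active depending on the DYN phase.
    """
    nets = [l_lines[0]]
    for i in range(1, 4):
        nets.append(l_lines[i] if dyn_load else ff[i - 1])
    return nets


def parallel_to_serial(l_lines):
    """Convert 4 parallel bits on L0..L3 to the serial stream seen at FF-O/P."""
    ff = [0, 0, 0, 0]
    ff = clock_edge(ff, l_lines, dyn_load=True)   # clock 1: parallel load
    serial = [ff[3]]                              # FF-O/P drives L0 of the next IOL
    for _ in range(3):                            # clocks 2..4: shift
        ff = clock_edge(ff, l_lines, dyn_load=False)
        serial.append(ff[3])
    return serial


print(parallel_to_serial([1, 0, 1, 1]))  # prints [1, 1, 0, 1]
```

In this model the bit loaded from L3 appears first and the bit from L0 last, so the four bits are delivered within four clock pulses, as the text states.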
B.1.2) Core-to-PAD: This mode is similar to the core-to-core mode, the only difference being the output destination, which in this mode is the IO pad. This is realized by programming mux M13 such that it selects the registered output of FF-O/P. Thus the serial output goes to line LO, which goes to the output buffer and then finally reaches the PAD. Here S0 of the next IOL is not used and its SWITCH B is kept permanently OFF.
The serial data output going to the pad in this mode can also be configured to go to the core, depending upon the configuration of switch box S0 of the next IOL. SWITCH A and SWITCH B of S0 are configured permanently ON and mux M13 is configured to select the registered output of FF-O/P for simultaneous serial data output to the IO pad and the core.
The advantage of the parallel-to-serial core-to-pad data conversion mode is that it can be used to overcome a shortage of IO pin resources. It can reduce the number of output pins required by first converting the parallel data into serial data and sending it to only one of the output pins.
B.1.3) PAD-to-PAD: In this mode the 4-bit parallel input data comes from external world using IO pins in the input mode, gets converted into serial data and the output is taken from a single output pin.
Referring to figure 2, the four IO pads of IOG1 are configured to take 4-bit parallel input data (for this configuration of IOG1, referring to figure 4, line LI of all four IOLs is selected to connect to line L4). The 4-bit input data goes to GENERAL ROUTING via routes R1. After reaching GENERAL ROUTING, the further operation is similar to that of the core-to-pad parallel-to-serial data converter. Any one of the IOLs of IOG2 or of any other IOG can be configured for the data conversion.
B.1.4) PAD-to-Core: In this mode the 4-bit parallel input data comes from external world using IO pins in the input mode, gets converted into serial data and the output goes to the core.
The operation of this mode is similar to the pad-to-pad case. Four-bit parallel input data is taken from the four IO pads of IOG1 and goes to GENERAL ROUTING via routes R1. Further operation is similar to the core-to-core parallel-to-serial data converter process.
B.2) Serial To Parallel Data Conversion
In this mode serial data is converted into parallel data using the flip-flops in the IOLs of IOGs. The data converter can be of any width.
The different options in this mode are described below using the example of a 4*1 bit serial to parallel data converter.
B.2.1) Core-to-Core: In this mode serial data coming from the core is loaded into the flip-flops and then taken out simultaneously to get parallel output, which goes back to the core.
Again referring to the IOL structure shown in figure 4, to operate in this mode SWITCH A of switch boxes S1-S3 in a given IOL is configured to be dynamically controlled by signal DYN (the structure of the switch boxes is shown in figure 5). SWITCH B of switch boxes S1-S3 is kept permanently ON. Switch box S0 of the same IOL has its SWITCH A permanently ON and SWITCH B permanently OFF, and switch box S0 of the next IOL (e.g. IOL3 is the next IOL to IOL2) has its SWITCH A dynamically controlled by signal DYN and SWITCH B permanently ON.
On commencement of this mode SWITCH A of switch box S0 is permanently ON and SWITCH B of switch box S0 is permanently OFF. SWITCH A of switch boxes S1, S2 and S3 is kept in the OFF state by signal DYN and SWITCH B of switch boxes S1, S2 and S3 is permanently ON. Muxes M0, M1, M2 and M3 are programmed to select the data on nets N0, N1, N2 and N3. The serial data coming from the core through net L0 passes through S0 and M0 to reach FF-I/P. The output of FF-I/P is fed to FF-T/Sp through S1 and M1. The output of FF-T/Sp is fed to FF-T/Sn through S2 and M2. The output of FF-T/Sn is fed to FF-O/P through S3 and M3. The output of FF-O/P is fed to switch box S0 of the next IOL (IOL3 in the case of IOL2). Thus the serial data coming from the core is loaded serially into flip-flops FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P with every clock pulse. After 4 clock pulses the data is loaded into the registers. Thereafter, SWITCH A of switch boxes S1, S2, S3 of the current IOL is turned ON by signal DYN. SWITCH A of S0 of the next IOL (IOL3) is also turned ON by signal DYN, and the 4-bit data is available in parallel on nets L1, L2, L3 of the same IOL and L0 of the next IOL (IOL3).
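The serial-to-parallel sequence can be sketched in the same behavioural style in Python. Positions 0 to 3 again stand for FF-I/P, FF-T/Sp, FF-T/Sn and FF-O/P; the function name and the dictionary keys are hypothetical labels for the nets on which the text says the parallel data appears.

```python
def serial_to_parallel(serial_bits):
    """Shift 4 serial bits in through L0 and return the parallel nets.

    During shifting SWITCH A of S1..S3 is OFF and SWITCH B is ON, so each
    flip-flop takes the output of the previous one. After 4 clock pulses
    DYN turns SWITCH A ON and the flip-flop outputs appear on the L nets.
    """
    if len(serial_bits) != 4:
        raise ValueError("this sketch models a single 4-bit IOL")
    ff = [0, 0, 0, 0]
    for bit in serial_bits:            # one clock pulse per serial bit
        ff = [bit] + ff[:3]
    return {
        "L1": ff[0],        # FF-I/P through S1
        "L2": ff[1],        # FF-T/Sp through S2
        "L3": ff[2],        # FF-T/Sn through S3
        "L0_next": ff[3],   # FF-O/P through S0 of the next IOL
    }


print(serial_to_parallel([1, 1, 0, 0]))
# prints {'L1': 0, 'L2': 0, 'L3': 1, 'L0_next': 1}
```

The first bit shifted in ends up in FF-O/P and therefore on L0 of the next IOL, while the last bit shifted in appears on L1.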
B.2.2) PAD-to-Core: This mode differs in only one respect from the core-to-core serial-to-parallel data converter mode. Mux M0 is programmed to select LI (LI is the output of the input buffer) instead of line N0, so that the serial input data is taken from the IO pad instead of from the core.
As with the parallel-to-serial core-to-pad data conversion mode, the advantage of this mode is that it can be used to overcome a shortage of IO pin resources. Instead of using a number of IO pins for parallel data input, it can reduce the number of input pins required by first accepting serial data from only one input pin and then converting the serial data into parallel data before sending it to the core.
B.2.3) Core-to-PAD: This mode has two phases, the first being conversion of 4-bit serial data from the core to parallel data, followed by transfer of this parallel data to the output buffers. The first phase is the same as for core-to-core serial-to-parallel data conversion. In the second phase, when the data is available on nets L1, L2, L3 and L0, it is sent to the GENERAL ROUTING matrix, which routes it to any four IO pads operating in output mode, and thus the data can be taken out in parallel from different output pins.
B.2.4) PAD-to-PAD: This mode also has two phases, the first being conversion of 4-bit serial data from the pad to parallel data, followed by transfer of this parallel data to the output buffers. The first phase is the same as for pad-to-core serial-to-parallel data conversion. In the second phase, when the data is available on nets L1, L2, L3 and L0, it is sent to the GENERAL ROUTING matrix, which routes it to any four IO pads operating in output mode, and thus the data can be taken out in parallel from different output pins.
While the above description applies to 4*1-bit data conversion, it is easily extended to 4*2-bit data conversion by cascading any two adjacent IOLs of the same IOG, or even two adjacent IOLs of two adjacent IOGs, to get 8 flip-flops. The two IOLs can be configured as an 8-bit (i.e. 4*2-bit) data converter. (The connection between IOLs of the same IOG and IOLs of different IOGs is shown in figure 6.)
Similarly, for 4*3-bit or 4*4-bit modes any three or four adjacent IOLs of an IOG, or three or four adjacent IOLs of two adjacent IOGs, can be cascaded to obtain a 12-bit or 16-bit data converter. To obtain more than 4*4-bit data conversion, IOLs of adjacent IOGs can be cascaded.
It can be seen that in the proposed architecture all the flip-flops in the complete IO ring (all the IOGs) can be connected to each other in a sequence using switch boxes and muxes, i.e. the output of the first flip-flop connected to the input of the second, the output of the second flip-flop connected to the input of the third, third to fourth and so on. In other words, this architecture enables data conversion of any number of bits.
The only requirement of the DATA CONVERSION mode is that, to implement an 'X-bit' data converter, 'X' flip-flops in a sequence must be available. There must also be a gap of at least one flip-flop between two separate data converters. For example, to implement two 4-bit data converters, there must be at least one flip-flop between them that is not utilized in data conversion (this flip-flop can be used in NORMAL mode).
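The availability and gap requirements stated above can be expressed as a simple placement check. The following is a hedged sketch under stated assumptions: the flip-flops are indexed along a linear chain (wrap-around of the IO ring is ignored for simplicity), and the function name and the (start, width) representation are hypothetical.

```python
def placement_is_valid(total_ffs, converters):
    """Check a set of data converters placed on a chain of flip-flops.

    converters: list of (start_index, width) pairs, one per converter.
    Returns True when every converter fits on the chain and any two
    converters are separated by at least one unused flip-flop.
    """
    prev_end = None  # index of the last flip-flop of the previous converter
    for start, width in sorted(converters):
        if width <= 0 or start < 0 or start + width > total_ffs:
            return False  # 'X' consecutive flip-flops are not available
        if prev_end is not None and start - prev_end < 2:
            return False  # overlapping or adjacent: no gap flip-flop
        prev_end = start + width - 1
    return True


# Two 4-bit converters on 16 flip-flops, with flip-flop 4 left as the gap.
print(placement_is_valid(16, [(0, 4), (5, 4)]))
```

In this model the gap flip-flop is simply an index not claimed by any converter; it remains free for use in NORMAL mode.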
DATA CONVERSION mode does not interrupt direct signaling of IO buffers in NORMAL mode. In the direct input case, mux M10 selects the direct input LI, and the direct data is supplied to the core by line L4. In the direct output case, muxes M11 and M12 select lines L5 and L6 for the tristate signals and send these directly to IOI. Similarly, mux M13 selects line L7 and connects it to the output buffer in IOI through line LO.
C. LUT OPERATION MODE
In this mode the complete IOLG can be configured to operate as a 4-input LUT, provided that none of the 16 flip-flops in the IOLG is used in any other mode. Direct signaling of IO buffers remains possible in this mode.
Referring first to figure 3(a), all the 16 flip-flops in an IOLG are transparent latches at the time of configuration. As all the configuration bits are initialized to '0', configuration bit CB1 is '0', which makes all the flip-flops independent of the clock so that they behave as simple latches. These latches are used as the storage elements of a LUT. On configuration, the data required for the logic implementation is first loaded into the latches, and configuration bit CB1 is then changed to '1', causing all the 16 latches to become clock-sensitive flip-flops. However, the stored data does not change, as the inputs to the flip-flops are tri-stated.
These 16 latches have their outputs connected to the LUT DECODER to form a 4-input LUT (the connection of the latches to the LUT DECODER is defined in figure 4). The four inputs to the LUT can be configured through bus mux BM to come from GENERAL ROUTING via 4-bit bus R6 or from bus R7. R7 is a 4-bit bus coming from IO pads P1-P4 of the parent IOG through input buffers BUFIN. Similarly, the output of the LUT, L-OUT, can go to the core through GENERAL ROUTING and/or to one, two, three or four of the four IO pads P1-P4 of the parent IOG through output buffers BUFOUT.
Figure 4 shows the complete connectivity of all the components in an IOL. In IOL2 the write lines are LUT writing lines, which at the time of configuration are used to load the required bits into the LUT storage cells. At the time of configuration these write lines are selected by muxes M0-M3 (because all the configuration bits are initialized to '0'). All the 16 flip-flops in an IOLG may be loaded in this way to configure the LUT for the required four-bit logic. Lines L-IN connect the outputs of the flip-flops to the LUT DECODER.
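The relationship between the 16 storage cells and the LUT DECODER described above may be illustrated with a short behavioural model. This is a sketch only: the function names are hypothetical, and the assignment of the four inputs to address bits (first input as most significant bit) is an assumption, since the actual ordering is defined by figure 4.

```python
def make_lut(func):
    """Compute the 16 bits to be loaded into the storage cells.

    func is any Boolean function of four 0/1 inputs. The first input is
    treated as the most significant address bit (an assumption).
    """
    table = []
    for addr in range(16):
        a, b, c, d = (addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
        table.append(int(bool(func(a, b, c, d))))
    return table


def lut_decode(table, a, b, c, d):
    """Model of the decoder: the four inputs select one stored bit."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


# Example: load the function (a AND b) OR (c XOR d), then evaluate it.
table = make_lut(lambda a, b, c, d: (a & b) | (c ^ d))
print(lut_decode(table, 1, 1, 0, 0))
```

Loading `table` corresponds to writing the storage cells at configuration time; `lut_decode` corresponds to normal operation, in which the stored bits no longer change.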
As explained earlier, the output of the LUT can be configured to go to the IO pads and/or to the core. Mux M13 of IOL2 selects connection of line L-OUT to output line LO for providing the LUT output to pad P2. Similarly, muxes M13 of IOL1, IOL3 and IOL4 can select line L-OUT to go directly to IO pads P1, P3 and P4. The four inputs to the LUT can also be configured to come from the core or from the IO pads. When the LUT takes its four inputs directly from the IO pads, these four inputs can also be passed to the core for other logic operations. This option of direct signaling between the LUT and the IO pads, without going through GENERAL ROUTING, reduces data delays.
It can be seen that in the LUT mode direct signaling of IO buffers in NORMAL mode is not interrupted. Line LI from BUFIN can be selected by mux M10 to go to the core through line L4. Similarly, lines L5, L6 and L7 coming from the core can be selected by muxes M11, M12 and M13 respectively to go to output buffer BUFOUT.
The requirement for operating in LUT mode is that the required number of flip-flops in an IOLG should be free.
For three-variable functions only eight flip-flops are required. Unused inputs can be tied to '1' or '0', as desired, and the corresponding flip-flops can be used in NORMAL or DATA CONVERSION modes. These '0' and '1' levels can be generated within the core.
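The reason only eight storage cells are needed for a three-variable function can be seen by enumerating which cells remain addressable once an input is tied to a constant. The sketch below is illustrative; the function name and the convention that position 0 is the most significant address bit are assumptions.

```python
from itertools import product


def reachable_addresses(tied):
    """List the storage-cell addresses still selectable by the decoder.

    tied: dict mapping an input position (0..3, position 0 taken as the
    most significant address bit) to the constant 0 or 1 it is tied to.
    """
    addrs = []
    for bits in product((0, 1), repeat=4):
        if all(bits[pos] == val for pos, val in tied.items()):
            addr = 0
            for bit in bits:
                addr = (addr << 1) | bit
            addrs.append(addr)
    return addrs


# One input tied to '0': only 8 of the 16 cells can ever be selected.
print(len(reachable_addresses({0: 0})))
```

The cells whose addresses are not returned are never read by the decoder, which is why their flip-flops are free for other modes.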
The proposed architecture can also be configured as a DYNAMIC LUT: in one clock period it implements a particular logic function of four variables, and in another clock period it implements a different logic function of the same four variables, because the stored data is allowed to change dynamically with the clock. This is done by connecting lines L0-L3 to lines N0-N3 through switch boxes S0-S3 and selecting lines N0-N3, instead of the write lines, by muxes M0-M3 as inputs to the flip-flops. Data on lines L0-L3 can be changed and loaded into the flip-flops with the clock pulse according to the required logic. The data on lines L0-L3 can be generated within the core.
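The dynamic behaviour may be pictured with the following behavioural sketch, in which the stored contents are replaced on a clock edge so that the same four inputs evaluate a different function afterwards. The class and method names are hypothetical, and the model abstracts away how the new data reaches the individual flip-flops through the switch boxes and muxes.

```python
class DynamicLut:
    """Behavioural model of a 4-input LUT whose contents can be reloaded."""

    def __init__(self, initial):
        if len(initial) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 stored bits")
        self.cells = list(initial)

    def clock(self, next_cells=None):
        """One clock edge; optionally load new contents supplied by the core."""
        if next_cells is not None:
            if len(next_cells) != 16:
                raise ValueError("a 4-input LUT needs exactly 16 stored bits")
            self.cells = list(next_cells)

    def read(self, a, b, c, d):
        """Decoder: select one stored bit (first input taken as MSB)."""
        return self.cells[(a << 3) | (b << 2) | (c << 1) | d]


and4 = [0] * 15 + [1]   # 4-input AND
or4 = [0] + [1] * 15    # 4-input OR

lut = DynamicLut(and4)
before = lut.read(1, 0, 0, 0)   # AND of 1,0,0,0
lut.clock(or4)                  # contents change with the clock
after = lut.read(1, 0, 0, 0)    # OR of 1,0,0,0
print(before, after)
```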
In another embodiment of the invention the LUT DECODER circuitry is used as a multiplexer. Referring to figure 7, muxes LM can be configured to select lines L0-L3, coming from the core, to go to the LUT DECODER. The LUT DECODER can be used to multiplex these lines, giving the multiplexed output at line L-OUT, which can be routed to the core and/or an IO pad. In this case there is no requirement to have free flip-flops, and they can be used in NORMAL mode (but not in any other mode). The multiplexer can also have 16, 8, 4 or 2 data inputs, in a manner similar to the LUT case, by supplying a permanent '0' or '1' to the select lines (called input lines in the LUT case).
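The use of the decoder as a multiplexer, and the reduction to fewer data inputs by tying select lines, can be sketched as follows. This is an illustrative model only; the function names and the convention that the first select line is the most significant are assumptions.

```python
def decoder_as_mux(data, select):
    """16:1 multiplexer: the four select bits pick one of 16 data lines.

    select is given most significant bit first (an assumed convention).
    """
    if len(data) != 16 or len(select) != 4:
        raise ValueError("expected 16 data lines and 4 select lines")
    index = 0
    for s in select:
        index = (index << 1) | s
    return data[index]


def mux8(data8, select3):
    """8:1 multiplexer obtained by tying the top select line to '0'.

    Only data lines 0..7 are reachable; the other eight are don't-care
    and are filled with 0 here.
    """
    return decoder_as_mux(list(data8) + [0] * 8, (0,) + tuple(select3))


data = [0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0]
print(decoder_as_mux(data, (0, 1, 0, 0)))   # selects data line 4
```

Tying two or three select lines in the same way would give the 4-input and 2-input cases mentioned in the text.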
We claim:
1. In an FPGA apparatus, an improvement for enabling the selective utilization of unused flip-flops or other circuit elements in IO cells and unused decoders or other circuit elements in Look Up Tables (LUT), for core logic functions, comprising:
disconnecting means for selectively disconnecting unused circuit elements from the IO pad circuitry or from said LUT circuitry, and connecting means for selectively connecting said disconnected circuit elements either to the connection matrix of the core logic or between themselves to provide independently configured functions.
2. An FPGA apparatus as claimed in claim 1 wherein said disconnecting means is Configuration Logic circuitry provided between the internal core logic and IO pad interface circuits or LUTs.
3. An FPGA apparatus as claimed in claim 1 wherein said connecting means is a routing matrix between internal core logic and said IO pad circuitry or LUT circuitry.
4. An FPGA apparatus as claimed in claim 1 wherein said unused IO pad flip-flops are configured as serial-to-parallel or parallel-to-serial data converters.
5. An FPGA apparatus as claimed in claim 1 wherein said unused LUT circuit elements are deployed to implement configurable two or four input logic functions.
6. An FPGA apparatus as claimed in claim 5 wherein said logic function is a multiplexer function.
7. An FPGA apparatus as claimed in claim 1 including grouping of said IO pads for enabling configurable complex logic functions.
8. A method for enabling the utilization of unused flip-flops or other unused circuit elements in IO cells and unused decoders or other circuit elements in Look Up Tables (LUTs) of an FPGA for core logic functions, comprising the steps of:
disconnecting said unused circuit elements from said IO circuitry and/or LUT, and
connecting said disconnected circuit elements to the connection matrix of the core logic or amongst themselves to provide independent functions.
9. A method as claimed in claim 8 wherein said disconnecting is done by Output Configuration Logic circuitry provided between the core logic and IO pad interface (IOL) circuits or LUT.
10. A method as claimed in claim 8 wherein said connecting is done by a routing matrix between internal core logic and said IO pad circuitry or LUT circuitry.
11. A method as claimed in claim 8 used for configuring said unused IO pad flip-flops as a parallel-to-serial or serial-to-parallel data converter.
12. A method as claimed in claim 8 wherein said unused LUT circuit elements are deployed to implement configurable two or four input logic functions.
13. A method as claimed in claim 12 wherein said logic function is a multiplexer function.
14. A method as claimed in claim 8 including grouping of said IO pads for enabling configurable complex logic functions.
15. An FPGA apparatus substantially as herein described with reference to and as illustrated in the accompanying drawings.
16. A method for enabling the utilization of unused flip-flops or other unused circuit elements in IO cells and unused decoders or other circuit elements in Look Up Tables (LUTs) of an FPGA for core logic functions substantially as herein described with reference to and as illustrated in the accompanying drawings.