Abstract: Described is a computer implemented method for reducing computation time of FFT of data values in a multiprocessor system (400). The data values are stored in a memory in a form of a 2-dimension (2D) squared matrix of order N. The computer implemented method includes computing a DBNS series to represent the order N, where the DBNS series comprises one or more summands of 2s-3t, and where exponents s and t are integers. The computer implemented method also includes determining a sum-total value S of all summands having non-zero exponents of 3 in the DBNS series, distributing data values of S number of rows of the 2D squared matrix of order N into one or more squared matrices of order 3, and performing FFT on the one or more squared matrices of order 3 using at least one base-3 processor (402) of the multiprocessor system (400). [[To be published with Figure 2]]
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
(See section 10, rule 13)
1. Title of the invention: DOUBLE-BASE NUMBER SYSTEM BASED FAST FOURIER
TRANSFORM IN MULTIPROCESSOR SYSTEMS
2. Applicants)
NAME NATIONALITY ADDRESS
TATA CONSULTANCY Indian Nirmal Building, 9th Floor,
SERVICES LIMITED Nariman Point, Mumbai,
Maharashtra 400021,
India
3. Preamble to the description
COMPLETE SPECIFICATION
The following specification particularly describes the invention and the manner in which it
is to be performed.
TECHNICAL FIELD
[001] The present subject matter relates to Fast Fourier Transform (FFT)
in multiprocessor systems and, particularly but not exclusively, to Double-Base Number System (DBNS) based FFT in multiprocessor systems.
BACKGROUND
[002] Discrete Fourier Transforms (DFTs) are known and find
applications in digital signal processing in computing systems. DFTs are
applicable for processing of signals that are discrete and periodic and the data in
such signals can be represented as Fourier transforms. It is well known that the
direct computation of DFT requires a large number of operations. Thus, the direct
computation of DFT requires a substantially large amount of processing, which
causes an increase in the computation time for processing of digital signals.
[003] In order to reduce the amount of processing and the computation
time for processing of digital signals, Fast Fourier Transforms (FFTs) are typically used for computation of DFTs in computing systems. It may be evident that while the computation of DFTs is faster using FFTs, it may still be important to further reduce the computation time of FFTs for speeding up the processing of digital signals computing systems.
SUMMARY
[004] This summary is provided to introduce concepts related to Double-
Base Number System (DBNS) based Fast Fourier Transform (FFT) in multiprocessor systems. This summary is neither intended to identify essential features of the claimed subject matter nor is it intended for use in determining or limiting the scope of the claimed subject matter.
[005] In accordance with an embodiment of the present subject matter, a
computer implemented method for reducing computation time of FFT of data values in a multiprocessor system is described. The data values are stored in a memory in a form of a 2-dimension (2D) squared matrix of order N. The
computer implemented method includes computing a DBNS series to represent the order N, where the DBNS series comprises one or more summands of 2s-3, and where exponents s and t are integers. The computer implemented method also includes determining a sum-total value S of all summands having non-zero exponents of 3 in the DBNS series, distributing data values of S number of rows of the 2D squared matrix of order N into one or more squared matrices of order 3, and performing FFT on the one or more squared matrices of order 3 using at least one base-3 processor of the multiprocessor system.
BRIEF DESCRIPTION OF DRAWINGS
[006] The detailed description is described with reference to the
accompanying figures. In the figures, the left-most digit(s) of a reference number ■
identifies the figure in which the reference number first appears. The same
numbers are used throughout the figures to reference like features and
components. Some embodiments of system and/or methods in accordance with
embodiments of the present subject matter are now described, by way of example
only, and with reference to the accompanying figures, in which:
[007] Figure 1 illustrates a distribution of a 2-dimansion matrix of data
values for performing Fast Fourier Transform (FFT) according to on a
conventional methodology based on parallel processing.
[008] Figure 2 illustrates a method for reducing computation time of FFT
of data values in a multiprocessor system, according to an embodiment of the
present subject matter.
[009] Figure 3 illustrates a distribution of a 2-dimansion matrix of data
values for performing Fast Fourier Transform (FFT), according to an embodiment
of the present subject matter.
[0010] Figure 4 illustrates a multiprocessor system, according to an
embodiment of the present subject matter.
[0011] It should be appreciated by those skilled in the art that any block
diagrams herein represent conceptual views of illustrative systems embodying the
principles of the present subject matter. Similarly, it will be appreciated that any
flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DETAILED DESCRIPTION
[0012] The present subject matter relates to computer implemented
methods for reducing computation time of FFT of data values and multiprocessor systems implementing such methods. The data values may be understood as data carried by discrete and periodic digital signals on which the FFT is to be performed.
[0013] The data values on which the FFT is to be performed may be in a
form of a 2-dimension (2D) array or matrix. In a conventional method for
performing an FFT on such 2D array or matrix of data values, at first the FFT is
performed for all the one-dimension rows of the data values, followed by
row/column transposition of the 2D array or matrix, and followed by performing
the FFT for all the one-dimension columns of the data values. The row/column
transposition is an operation in which the rows of the 2D array or matrix are
written as columns. Such a conventional method of performing the FFT on the 2D
array or matrix of data values requires a large number of operations, which affects
the computation time of FFT. Also, for a squared 2D array or matrix of order n,
number of steps involved in the row/column transposition are n*(n-l)/2. These
additional steps performed for the operation of row/column transposition during
the FFT incur a computational delay, which affects the processing speed of the
computing system. Further, with such a conventional method implemented in a
computing system with a single processor, the computation time is substantially
large, which affects the processing speed of the computing system.
[0014] In another conventional method, the FFT on 2D array or matrix of
data values is performed through parallel processing. Such a conventional method is implemented in computing systems having multiple processors. In the computing systems with multiple processors, depending on the number of
processors and the number or row/columns in the 2D array or matrix, one or more
row/columns are allotted to each of the multiple processors for processing. With
this configuration and methodology, FFT is performed simultaneously, i.e., in
parallel, on multiple rows/columns by the multiple processors. Further, in the case
of parallel processing, row/column transposition of the 2D array or matrix is also
performed after the FFT on rows and before the FFT on columns. This incurs a
computational delay and affects the processing speed of the computing system.
[0015] Conventionally, a computing system having multiple processors
may be based on shared-memory implementation or distributed memory implementation. In the shared-memory implementation, all processors share the same physical data memory for processing of data. In the distributed-memory implementation, all processors have separate physical data memory for processing of data. Tn the computing system with multiple processors, whether based on share-memory implementation or distributed-memory implementation, the 2D array or matrix of data values is distributed in a predefined manner and then allotted to the multiple processors for performing the FFT.
[0016] Figure 1 illustrates a distribution of a 2D matrix of data values for
performing FFT according to on a conventional methodology based on parallel processing. The conventional methodology is described for FFT of a squared 2D matrix of order 8 using 4 processors in the computing system. As illustrated in Figure 1, for the 2D matrix of order 8, the 2D matrix is distributed into 2x2 matrices from A to P. In case the computing systems is based on the share-memory implementation, the conventional methodology for performing FFT on the 2D matrix follows the following procedure:
Step 1: Each of the rows with 2x2 matrices is allotted to an individual
processor from the 4 processors, and FFT on the each of the allotted
rows with the 2x2 matrices is performed, in parallel, by the
corresponding individual processor. Step 2: Row/column transposition of 2D matrix is performed. Step 3: Each of the columns with 2x2 matrices is allotted to an individual
processor from the 4 processors, and FFT on the each of the allotted
columns with the 2x2 matrices is performed, in parallel, by the
corresponding individual processor,
[0017] Alternatively, in case the computing systems is based on the
distributed-memory implementation, the conventional methodology for performing FFT on the 2D matrix follows the following procedure:
Step 1: Each of the rows with 2x2 matrices is allotted to an individual
processor from the 4 processors, and FFT on the each of the allotted
rows with the 2x2 matrices is performed, in parallel, by the
corresponding individual processor, Step 2: Total exchange is performed by interchanging the matrices B-E,
matrices C-I, matrices G-J, matrices D-M, matrices H-N, and matrices
L-O. Step 3: Row/column transposition of each individual 2x2 matrix is performed. Step 4: Again, each of the rows with 2x2 matrices is allotted to an individual
processor from the 4 processors, and FFT on the each of the allotted
rows with the 2x2 matrices is performed, in parallel, by the
corresponding individual processor.
[0018] Although, the conventional methodology of performing FFT on a
2D matrix based on parallel processing using multiple processors is efficient and faster in comparison to the conventional methodology based on processing using a single processor, the number of steps involved in the row/column transposition operation in the shared-memory implemented computing system, and the additional steps of the combination of total exchange and row/column transposition in the distributed-memory implemented computing system, as described above, incur a computational delay, which affects the processing speed of the computing system.
[0019] The present subject matter describes computer implemented
methods for reducing computation time of FFT of data values and multiprocessor systems implementing such methods. For the purposes of the present subject matter, the data values, on which the FFT is to be performed, are stored in a memory, for example, a Random-Access Memory (RAM), in a form of a 2-
dimension (2D) squared matrix of order N. Also, for the purposes of the present subject matter, the multiprocessor systems include at least one base-3 processor and at least one base-2 processor. A base-3 processor may be understood as a processor that processes digital data represented using a base-3 number system, and a base-2 processor may be understood as a processor that processes digital data represented using a base-2 number system.
[0020] The methods and the multiprocessor systems of the present subject
matter utilize Double-base Number System (DBNS) for performing FFT of the data values. With the method of the present subject matter, the number of steps for performing the FFT, particularly for the row/column transposition and/or the total exchange steps in the FFT, is lower in comparison to the conventional methods. This facilitates in reducing the computation time of FFT and hence improving the processing speeds in the multiprocessor systems in comparison to conventional multiprocessor systems.
[0021] In an implementation, based on the order N of the 2D squared
matrix of data values, a DBNS series is computed to represent the order N. The DBNS series is computed to comprise one or more summands of 2s-3', where s and t are integers, and the order N is the sum of all the summands. In an implementation, the DBNS series may be computed in a conventional manner as known to a skillful person.
[0022] After determining the DBNS series, a sum-total value S of all
summands, in the DBNS series, having non-zero exponents of 3 is determined.
Based on the sum-total value S, data values of S number of rows of the 2D
squared matrix are distributed into squared matrices of order 3, i.e., 3x3 matrices.
In an implementation, depending of the order N and the sum-total value S, the 2D
squared matrix may have one or more squared matrices of order 3.
[0023] Further, in an implementation, in case there are any data values left
over outside the matrices of order 3, i.e., the data values left un-distributed as a part of one of the matrices of order 3, such data values, hereinafter referred to as the left-over data values of the 2D squared matrix, may be distributed into one or more squared matrices of order 2 and/or one or more matrices of order 1.
[0024] After distributing the data values of the 2D squared matrix, the
squared matrices of order 3 are allocated to at least one base-3 processor for performing the FFT thereon. Based on the allocation, the FFT is performed on the squared matrices of order 3 using the base-3 processor. In an implementation, as the case may be, the one or more squared matrices of order 2 and the one or more matrices of order 1, which include the left-over data values, are allocated to at least one base-2 processor, and the FFT is performed of such matrices, in parallel, using the base-2 processor.
[0025] Further, in an implementation, the multiprocessor system of the
present subject matter may be based on a shared-memory implementation or
based on a distributed-memory implementation, and, as the case may be, the FFT
on the squared matrices of order 3, the squared matrices of order 2 and the
matrices of order 1 may be performed and computed accordingly.
[0026] With the distribution of the 2D matrix of data values according to
the method of present subject matter, the number of steps for performing the FFT, particularly for the row/column transposition and/or the total exchange steps in the FFT, is lower in comparison to the conventional methods. Thus, the computation time of FFT of the 2D matrix, in accordance with the method of the present subject matter, is lower, which facilitates in improving the processing speeds in the multiprocessor systems in comparison to conventional multiprocessor systems.
[0027] Further, in an implementation, based on the method of the present
subject matter, the FFT can be performed on a 2D squared matrix of data values of
any order. In an implementation, the order N may be a non-zero exponent of 2.
[0028] These and other advantages of the present subject matter would be
described in greater detail in conjunction with the following figures. It should be noted that the description and figures merely illustrate the principles of the present subject matter.
[0029] Figure 2 illustrates a method 200 for reducing computation time of
FFT of data values in a multiprocessor system, according to an embodiment of the present subject matter. The multiprocessor system implemented to execute the
method 200 is described later in the description with reference to Figure 4. The method 200 may be executed in a multiprocessor system comprising a RAM, on which the data values are stored for processing and computation of FFT, and comprising at least one base-3 and at least one base-2 processor using which the FFT of the data values is performed.
[0030] The method 200 may be described in the general context of
computer executable instructions. Generally, computer executable instructions can
include routines, programs, objects, components, data structures, procedures,
modules, and functions that perform particular functions or implement particular
abstract data types. The method 200 may also be practiced in a distributed
computing environment where functions are performed by remote processing
devices that are linked through a communications network. In a distributed
computing environment, computer executable instructions may be located in both
local and remote computer storage media, including memory storage devices.
[0031] The order in which the method 200 is described is not intended to
be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or an alternative method. Additionally, individual blocks may be deleted from the method without departing from the spirit and scope of the subject matter described herein. Furthermore, the method 200 can be implemented in any suitable hardware, software, firmware, or combination thereof.
[0032] In an implementation, for the purpose of performing of FFT on the
data values, the data values are stored in a form of a 2D squared matrix of order N on the RAM. It may be understood that the size or order of data values, on which the FFT can be performed in one instance, can be either equal to or less than the size of the RAM.
[0033] At block 202, a DBNS series is computed to represent the order N
of the 2D squared matrix of data values. The DBNS series may be computed after storing the data values on the RAM. The DBNS series is computed to distribute the order N as one or more summands of 2s-3l, where s and t are integers. As
mentioned earlier, the DBNS series may be computed in a conventional manner as known to a skillful person.
[0034] In an implementation, for computing the DBNS series the order N
is at first distributed to determine one or more summands having non-zero
exponents of 3. Subsequently, the value from the order N which is left un
distributed is distributed to determine one or more summands with zero exponents
of 3. In an example, if the order N is 8, the DBNS series is 21-31 + 21-3°. In
another example, if the order N is 32, the DBNS series is 23-3' + 21-31 + 22-3°.
Similarly, if the order N is 128, the DBNS series is 22-33 + 2'-32 + 21-3°.
[0035] At block 204, a sum-total value S of all the summands having non-
zero exponents of 3 is determined. After determining the sum-total value S, data values of S number of rows of the 2D squared matrix are distributed, at block 206, into squared matrices of order 3, i.e., 3x3 matrices. In an implementation, the first S rows of the 2D squared matrix may be distributed into the squared matrices of order 3. The number of squared matrices of order 3, in which S rows are distributed, is equal to (S/3)2. In an example, in the DBNS series for the order N = 8 as mentioned above, the sum-total value S of summands having non-zero exponents of 3 is 6. Based on this, data values of 6 rows of the 2D squared matrix of order N = 8 are distributed into 4 squared matrices of order 3. In another example, in the DBNS series for the order N = 32 as mentioned above, the sum-total value S of summands having non-zero exponents of 3 is 30. Based on this, data values of 30 rows of the 2D squared matrix of order N = 32 are distributed into 100 squared matrices of order 3.
[0036] In an implementation, at block 208, the left-over data values of the
2D squared matrix, i.e., the data values which are left un-distributed and outside the squared matrices of order 3, are distributed into one or more squared matrices of order 2 and/or one or more matrices of order 1. In an example, as mentioned above, if the order N is 8, first 6 rows of 2D matrix of data values may be distributed into 4 squared matrices of order 3. In this case, the left-over data values in the last 2 rows and in the last 2 columns may then be distributed either into matrices of order 2 or into matrices of order 1 or a combination thereof.
[0037] At block 210, based on the distribution of the 2D squared matrix,
the squared matrices of order 3 are allocated to the at least one base-3 processor,
and the squared matrices of order 2 and/or the matrices of order 1 are allocated to
the at least one base-2 processor, for performing the FFT thereon.
[0038] In an implementation, the at least one base-3 processor may
include a plurality of sub-processors. Each of the sub-processors of the base-3 processor is allocated with one or more individual matrices of order 3 for performing the FFT.
[0039] Similarly, in an implementation, the at least one base-2 processor
may include a plurality of sub-processors. Each of the sub-processors of the base-2 processor is allocated with one or more individual matrices of order 2 and/or one or more individual matrices of order 1, as the case may be, for performing the FFT.
[0040] At block 212, based on the allocation, the FFT is performed on the
squared matrices of order 3 using the at least one base-3 processor, or using the plurality of sub-processors of the at least one base-3 processor. At block 214, based on the allocation, the FFT is performed on the squared matrices of order 2 and/or the matrices of order 1 using the at least one base-2 processor, or using the plurality of sub-processors of the at least one base-2 processor. In an implementation, FFT on the squared matrices of order 3, FFT on the squared matrices of order 2 and FFT of matrices of order 1 are performed in parallel by the processors of the multiprocessor system.
[0041] As it is understood that the data values, for the purpose of
processing and computation of FFT, are stored in RAM. The order N of 2D
squared matrix of data values, on which the FFT is to be performed depends on
the size of RAM. The size of RAM is typically represented as a bit-size or byte-
size, which may be of a size of the order of a predefined exponent of 2. In an
example, the size of RAM may be from 21 bit to about 2 gigabyte (GB).
[0042] Typically, the RAMs are of sizes of 8x8 bits (64 or 26 bits), 16x16
bits (256 or 28 bits), 32x32 (1024 or 210 bits) and so on, such that a RAM are hardwared as a 2D squared matrix of order 2n, where n is 3 or more. With this
limitation of the size of RAM, the data values are stored on the RAM as a 2D squared matrix of order N, where the order N may be equal to 23 or more, and the FFT is largely performed on the data values stored in the form of such 2D squared matrices.
[0043] The description hereinafter describes the methodology of the
present subject matter for performing FFT on data values in the form of a 2D squared matrix of order N = 8 in a multiprocessor system having one base-3 processor and one base-2 processor. The base-3 and the base-2 processors may, respectively, have a plurality of sub-processors. It is to be understood that the example for the 2D squared matrix of order N = 8 is only for the purpose of explanation, and FFT of data values in the form of 2D squared matrix of other orders N is performed in a similar manner.
[0044] For the order N = 8, the DBNS series is computed as 2l-3l + 2,-3°.
From the DBNS series, the sum-total value S is determined as 6. Based on the sub-total value S, 6 rows of the 2D squared matrix are into squared matrices of order 3, and the left-over data values are distributed into squared matrices of order 2. Figure 3 illustrates a distribution of data values in the form of 2D matrix of order N = 8 for performing FFT, according to an embodiment of the present subject matter. As shown in Figure 3, first 6 rows of data values are distributed into 4 squared matrices of order 3 from A' to D', and the left-over data values in the last 2 rows and in the last 2 columns are distributed into 7 squared matrices of order 2 from E'to K'.
[0045] Based on the distribution of data values as shown in Figure 3, the
squared matrices of order 3 A' to D' are allocated to the base-3 processor or the
sub-processors thereof, and the squared matrices of order 2 E' to K' are allocated
to the base-2 processor or the sub-processors thereof, for performing the FFT. The
FFT is then performed by the base-3 and the base-2 processors in parallel.
[0046] As mentioned earlier, for the computation of FFT of a squared
matrix of order n based on the conventional methods, the number of row/column transposition steps performed are n*{n-l)/2. Thus, in the conventional methods of performing FFT on a squared matrix of order 8, the number of row/column
transposition steps performed is 28. However, based on the method of present
subject matter and the distribution of the squared matrix of order N = 8 based on
the DBNS, as shown in Figure 3, the number of row/column transposition steps
performed is 23. The reduction in the number of row/column transposition steps
for performing the FFT facilitates in reducing the computation time of FFT. As a
result, the processing speed of the multiprocessor system is increased.
[0047] Table 1 illustrates values of Timing Benchmarks, also understood
as computation time, for FFT on 2D squared matrix of various orders N when performed in accordance with the method of the present subject matter. The values of Timing Benchmarks are given in milliseconds (ms) for orders N = 8, 16, 32, 64, and 128 when computed in a multiprocessor system having one base-3 processor and one base-2 processor, where the combination of base-3 and base-2 processors is partitioned into 2 or 4 sub-processors.
TABLE 1
Timing Timing Timing Timing Timing
No. of Sub- Benchmark Benchmark Benchmark Benchmark Benchmark
for 2D squared for 2D squared for 2D squared for 2D squared for 2D squared
processors matrix of matrix of matrix of matrix of matrix of
order 8 order 16 order 32 order 64 order 128
2 0.32 ms 0.62 ms 1.45 ms 2.37 ms 3.21 ms
4 0.28 ms 0.59 ms 1.38 ms 2.32 ms 3.19ms
[0048] Figure 4 illustrates a multiprocessor system 400, according to an
embodiment of the present subject matter. In an implementation, the multiprocessor system 400 implements the method 200 for reducing computation time of FFT of data values as described earlier in the description. The multiprocessor system 400 includes, but is not limited to, a computing device such as a mainframe computer, a workstation, a personal computer, a desktop computer, a minicomputer, a server, a laptop, and a tablet; a mobile
communication device such as a personal digital assistant, a smart phone, and a mobile phone; a Digital Signal Processing based device such as a hearing aid device, and the like.
[0049] The multiprocessor system 400, hereinafter referred to as the
system 400, includes one or more base-3 processor(s) 402, one or more base-2 processor(s) 404, interface(s) 406, and a memory 408 coupled to the processors 402, 404. As described earlier, the base-3 processor 402 is a processor that processes digital data represented using a base-3 number system, and the base-2 processor 404 is a processor that processes digital data represented using a base-2 number system.
[0050] Further, each of the processors 402, 404 can be a single processor
unit or a number of units, all of which could include multiple computing units. The processor(s) 402, 404 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) 402, 404 is configured to fetch and execute computer-readable instructions and data stored in the memory 408.
[0051] Functions of the various elements shown in Figure 4, including the
functional blocks labeled as "processor(s)", may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, or by a plurality of sub-processors. Moreover, explicit use of the term "processor" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, with a limitation, Digital Signal Processor (DSP) hardware, network processor, Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA), Read Only Memory (ROM) for storing software, Random Access Memory (RAM), and non-volatile storage. Other hardware, conventional or custom, may also be included. Further, the processors
402, 404 may include various hardware components, such as adders, shifters, sign
correctors, and generators required for executing various applications such as
arithmetic operations.
[0052] The interface(s) 406 may include a variety of software and
hardware interfaces, for example, interfaces for peripheral device(s), such as a
keyboard, a mouse, an external memory, and a printer. The interface(s) 406 may
enable the system 400 to communicate with other devices, such as external
computing devices and external databases.
[0053J The memory 408 may include any computer-readable medium
known in the art including, for example, volatile memory such as static random
access memory (SRAM) and dynamic random access memory (DRAM), and/or
non-volatile memory, such as read only memory (ROM), erasable programmable
ROM, flash memories, hard disks, optical disks, and magnetic tapes.
[0054] The memory 408 includes module(s) 410 and data 412. The
modules 410 include routines, programs, objects, components, data structures, and
the like, which perform particular tasks or implement particular abstract data
types. The modules 410 further include modules that supplement applications on
the system 400, for example, modules of an operating system. The data 412,
amongst other things, serves as a repository for storing data that may be
processed, received, or generated by one or more of the modules 410.
[0055] In an implementation, the modules 410 of the system 400 include a
computation module 414, an allocation module 416, and other module(s) 418. The
other module(s) 418 may include programs or coded instructions that supplement
applications and function, for. example, programs in the operating system of the
system 400.
[0056] In an implementation, data 412 include processing data 420,
computation data 422, and other data 424. The other data 424 includes data
generated as a result of the execution of one or more modules in the other
module(s)4I8.
[0057] The computation module 414 is configured to store the data values
in the processing data 420 of the memory 408 for performing FFT on the data
values. The processing data 420 may be the RAM of the system 400. The data values are stored in the processing data 420 in the form of a 2D squared matrix. In an implementation, in case the data values do not form a squared matrix, a number of null data values may be stored in the processing data 420 to form one or more extra rows and columns adjacent to the data values, to make the data values form a 2D squared matrix. The order of the 2D squared matrix of data values may be N, where N is an integer. In an implementation, the order N is an integer with a nonzero exponent of 2.
[0058] For the purpose of performing FFT on the stored data values, the
computation module 414 computes a DBNS series to represent the order N as a sum of one or more summands of 2s-3l, where s and t are integers. The DBNS series may be computed in a manner as described earlier in the description. Based on the DBNS series, the computation module 414 then determines a sum-total value S of all the summands, in the DBNS series, having non-zero exponents of 3. The data related to the DBNS series and the sum-total value S may be stored in computation data 422 by the computation module 414.
[0059] Further, based on the determined sum-total value S, the
computation module 414 distributes data values of S number of rows of the 2D
squared matrix of data values into squared matrices of order 3. In case all the data
values are not distributed into the squared matrices of order 3, the computation
module 414 distributes the left-over data values of the 2D squared matrix into at
least one of one or more squared matrices of order 2 and one or more matrices of
order 1. The example illustrating the distribution of a 2D squared matrix of order
8 is described with reference to Figure 3. However, 2D matrices of other orders
are also distributed in a similar manner as explained above.
[0060] Based on the distribution of the data values, the allocation module
416 then is configured to allocate the squared matrices of order 3 to the.at least one base-3 processor 402 for performing the FFT thereon, and the computation module 414 is configured to perform FFT on the squared matrices of order 3 using the at least one base-3 processor 402. In an implementation, the at least one base-3 processor 402 may include a plurality of sub-processors (not shown). In such a
case, each of the sub-processors may be allocated with one or more of the squared matrices of order 3 for performing the FFT.
[0061] In an implementation, where the 2D matrix of data values is
distributed to include one or more squared matrices of order 2 and/or one or matrices of order 1, the allocation module 416 is configured to allocate those matrices to the at least one base-2 processor 404 for performing the FFT thereon, and the computation module 414 is configured to perform FFT on the squared matrices of order 2 and/or on the matrices of order 1 using the at least one base-2 processor 404. In an implementation, the at least one base-2 processor 404 may include a plurality of sub-processors (not shown). In such a case, each of the sub-processors may be allocated with one or more of the squared matrices of order 2 and/or the matrices of order 1 for performing the FFT.
[0062] In an implementation, the system 400 may be a handheld
computing device or a mobile communication device, such as a mobile phone, in
which FFT is to be performed on data values. The FFT may be performed on the
data values for the purpose of image processing, audio filtering, etc. By using the
DBNS series for distribution of data values into one or more squared matrices of
order 3, and one or more squared matrices of order 2 and one or matrices of order
1 in addition, as the case may be, the computation time of FFT and hence the
processing speed of the system 400 is improved as the number of row/column
transposition steps required for performing the FFT are reduced in comparison to
the number of steps required in the conventional systems and methods.
[0063] Although embodiments for the method and multiprocessor system
have been described in language specific to structural features, it is to be
understood that the invention is not necessarily limited to the specific features
described. Rather, the specific features are disclosed and explained in the context
of a few embodiments for the method and multiprocessor system.
[0064] Other advantages of the method and multiprocessor system of the
present subject matter will become better understood from the description and claims of an exemplary embodiment of the method and multiprocessor system.
The method and multiprocessor system of the present subject matter are not
restricted to the embodiments that are mentioned above in the description.
[0065] Although the subject matter has been described with reference to
specific embodiments, this description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiments, as well as alternate embodiments of the subject matter, will become apparent to persons skilled in the art upon reference to the description of the subject matter. It is therefore contemplated that such modifications can be made without departing from the spirit or scope of the present subject matter as defined.
I/we claim:
1. A computer implemented method for reducing computation time of fast
fourier transform (FFT) of data values in a multiprocessor system (400),
wherein the data values are stored in a memory in a form of a 2-dimension
(2D) squared matrix of order N, the method comprising:
computing a double-base number system (DBNS) series to represent the order N, wherein the DBNS series comprises one or more summands of 2s-3', and wherein exponents s and t are integers;
determining a sum-total value S of all summands, in the DBNS series, having non-zero exponents of 3;
distributing data values of S number of rows of the 2D squared matrix of order N into one or more squared matrices of order 3; and
performing FFT on the one or more squared matrices of order 3 using at least one base-3 processor (402) of the multiprocessor system (400).
2. The method as claimed in claim 1 further comprising:
distributing left-over data values in the 2D squared matrix into at least one of one or more squared matrices of order 2 and one or more matrices of order 1; and
performing FFT on the one or more squared matrices of order 2 and the one or more matrices of order 1 using at least one base-2 processor (404) of the multiprocessor system (400).
3. The method as claimed in claim 2, wherein the performing the FFT on the one or more squared matrices of order 3, on the one or more squared matrices of order 2, and on the one or more matrices of order 1 are done in parallel.
4. The method as claimed in claim 1, wherein the order N is an integer with a non-zero exponent of 2.
5. A multiprocessor system (400) comprising:
a plurality of processors comprising at least one base-3 processor (402) and at least one base-2 processor (404); and
a memory (408) coupled to the plurality of processors, the memory (408) comprising:
a computation module (414) configured to,
store data values in the memory for performing fast fourier transform (FFT) of the data values, wherein the data values are stored in a form of a 2-dimention squared matrix of order N;
compute a double-base number system (DBNS) series to represent the order N, wherein the DBNS series comprises one or more summands of 2S3', and wherein exponents s and t are integers;
determine a sum-total value S of all summands, in the DBNS series, having non-zero exponents of 3; and
distribute data values of S number of rows of the 2D squared matrix of order N into one or more squared matrices of order 3; and
an allocation module (416) configured to,
allocate the one or more squared matrices of order 3 to the at least one base-3 processor (402) for performing the FFT.
6. The multiprocessor system (400) as claimed in claim 5, wherein the computation module (414) is further configured to perform FFT on the one or more squared matrices of order 3 using the at least one base-3 processor (402).
7. The multiprocessor system (400) as claimed in claim 5, wherein the computation module (414) is further configured to distribute left-over data values of the 2D squared matrix into at least one of one or more squared matrices of order 2 and one or more matrices of order 1, and wherein the allocating module (416) is further configured to allocate the one or more
squared matrices of order 2 and to the one or more matrices of order 1 to the at least one base-2 processor (404) for performing the FFT.
8. The multiprocessor system (400) as claimed in claim 7, wherein the computing module (414) is further configured to perform FFT on the one or more squared matrices of order 2 and the one or more matrices of order 1 using the at least one base-2 processor (404).
9. The multiprocessor system (400) as claimed in claim 5, wherein the order N is an integer with a non-zero exponent of 2.
10. The multiprocessor system (400) as claimed in claim 5, wherein the at least one base-3 processor (402) comprises a plurality of sub-processors, and wherein the plurality of sub-processors performs FFT on the squared matrices of order 3.
11. The multiprocessor system (400) as claimed in claim 5, wherein the at least one base-2 processor (404) comprises a plurality of sub-processors, and wherein the plurality of sub-processors performs FFT on the squared matrices of order 2 and the matrices of order 1.
12. A computer-readable medium having computer-executable instructions that when executed perform acts comprising:
computing a double-base number system (DBNS) series to represent order N of a 2D squared matrix of data values, wherein the DBNS series comprises one or more summands of 2s-3l, and wherein exponents s and t are integers;
determining a sum-total value S of all summands, in the DBNS series, having non-zero exponents of 3;
distributing data values of S number of rows of the 2D squared matrix of order N into one or more squared matrices of order 3; and
performing FFT on the one or more squared matrices of order 3 using at least one base-3 processor of a multiprocessor system.
13. The computer-readable medium as claimed in claim 12 further performs acts comprising:
distributing left-over data values in the 2D squared matrix into at least one of one or more squared matrices of order 2 and one or more matrices of order
1; and
performing FFT on the one or more squared matrices of order 2 and the one or more matrices of order 1 using at least one base-2 processor of the multiprocessor system.
| # | Name | Date |
|---|---|---|
| 1 | 2141-MUM-2012-Response to office action (Mandatory) [18-06-2019(online)].pdf | 2019-06-18 |
| 1 | ABSTRACT 1.jpg | 2018-08-11 |
| 2 | 2141-MUM-2012-FORM 3.pdf | 2018-08-11 |
| 2 | 2141-MUM-2012-AbandonedLetter.pdf | 2019-06-13 |
| 3 | 2141-MUM-2012-FORM 2[TITLE PAGE].pdf | 2018-08-11 |
| 3 | 2141-MUM-2012-CLAIMS [24-04-2019(online)].pdf | 2019-04-24 |
| 4 | 2141-MUM-2012-FORM 26(3-9-2012).pdf | 2018-08-11 |
| 4 | 2141-MUM-2012-COMPLETE SPECIFICATION [24-04-2019(online)].pdf | 2019-04-24 |
| 5 | 2141-MUM-2012-FORM 2.pdf | 2018-08-11 |
| 5 | 2141-MUM-2012-FER_SER_REPLY [24-04-2019(online)].pdf | 2019-04-24 |
| 6 | 2141-MUM-2012-OTHERS [24-04-2019(online)].pdf | 2019-04-24 |
| 6 | 2141-MUM-2012-FORM 18(30-7-2012).pdf | 2018-08-11 |
| 7 | 2141-MUM-2012-FORM 4(ii) [05-04-2019(online)].pdf | 2019-04-05 |
| 7 | 2141-MUM-2012-FORM 1.pdf | 2018-08-11 |
| 8 | 2141-MUM-2012-FORM 1(30-7-2012).pdf | 2018-08-11 |
| 8 | 2141-MUM-2012-FER.pdf | 2018-09-28 |
| 9 | 2141-MUM-2012-DRAWING.pdf | 2018-08-11 |
| 9 | 2141-MUM-2012-ABSTRACT.pdf | 2018-08-11 |
| 10 | 2141-MUM-2012-CLAIMS.pdf | 2018-08-11 |
| 10 | 2141-MUM-2012-DESCRIPTION(COMPLETE).pdf | 2018-08-11 |
| 11 | 2141-MUM-2012-CORRESPONDENCE(3-9-2012).pdf | 2018-08-11 |
| 11 | 2141-MUM-2012-CORRESPONDENCE.pdf | 2018-08-11 |
| 12 | 2141-MUM-2012-CORRESPONDENCE(30-7-2012).pdf | 2018-08-11 |
| 13 | 2141-MUM-2012-CORRESPONDENCE(3-9-2012).pdf | 2018-08-11 |
| 13 | 2141-MUM-2012-CORRESPONDENCE.pdf | 2018-08-11 |
| 14 | 2141-MUM-2012-CLAIMS.pdf | 2018-08-11 |
| 14 | 2141-MUM-2012-DESCRIPTION(COMPLETE).pdf | 2018-08-11 |
| 15 | 2141-MUM-2012-ABSTRACT.pdf | 2018-08-11 |
| 15 | 2141-MUM-2012-DRAWING.pdf | 2018-08-11 |
| 16 | 2141-MUM-2012-FER.pdf | 2018-09-28 |
| 16 | 2141-MUM-2012-FORM 1(30-7-2012).pdf | 2018-08-11 |
| 17 | 2141-MUM-2012-FORM 1.pdf | 2018-08-11 |
| 17 | 2141-MUM-2012-FORM 4(ii) [05-04-2019(online)].pdf | 2019-04-05 |
| 18 | 2141-MUM-2012-FORM 18(30-7-2012).pdf | 2018-08-11 |
| 18 | 2141-MUM-2012-OTHERS [24-04-2019(online)].pdf | 2019-04-24 |
| 19 | 2141-MUM-2012-FER_SER_REPLY [24-04-2019(online)].pdf | 2019-04-24 |
| 19 | 2141-MUM-2012-FORM 2.pdf | 2018-08-11 |
| 20 | 2141-MUM-2012-FORM 26(3-9-2012).pdf | 2018-08-11 |
| 20 | 2141-MUM-2012-COMPLETE SPECIFICATION [24-04-2019(online)].pdf | 2019-04-24 |
| 21 | 2141-MUM-2012-FORM 2[TITLE PAGE].pdf | 2018-08-11 |
| 21 | 2141-MUM-2012-CLAIMS [24-04-2019(online)].pdf | 2019-04-24 |
| 22 | 2141-MUM-2012-FORM 3.pdf | 2018-08-11 |
| 22 | 2141-MUM-2012-AbandonedLetter.pdf | 2019-06-13 |
| 23 | ABSTRACT 1.jpg | 2018-08-11 |
| 23 | 2141-MUM-2012-Response to office action (Mandatory) [18-06-2019(online)].pdf | 2019-06-18 |
| 1 | searchSTRATEGY_28-09-2018.pdf |