Abstract: A context-adaptive binary arithmetic coding method for decoding and encoding bit streams in a very long instruction word (VLIW) architecture is disclosed. The method of decoding bit streams is performed in seven cycles by utilizing a set of instructions which includes a look up table instruction for decoding a bin, a subtraction with status update instruction, a normalization instruction to normalize the input data, a bit extraction instruction to extract bits from a plurality of input registers and a get bits from bit stream buffer instruction. The method of encoding bit streams is performed in six cycles by utilizing a look up table instruction for encoding a bin and a re-normalization instruction to update the plurality of global variables during the re-normalization process. A reset instruction, common to both the decoding and the encoding methods, is used to reset the context-adaptive binary arithmetic coding engine.
4. DESCRIPTION:
Technical Field of the Invention
[001] The present invention generally relates to the field of context-adaptive binary arithmetic coding, and in particular to the encoding and decoding of a bin. More particularly, the encoding and decoding of a bin is performed by executing multiple instructions in a number of clock cycles.
Background of the Invention
[002] Generally, the recent video coding standard H.264/MPEG-4 AVC provides better compression efficiency than previous video coding standards such as MPEG-2 and H.263. The standard defines different profiles which specify the coding tools used to create a standard-compliant bitstream. The Main profile and High profile in the H.264 standard can use either context-adaptive binary arithmetic coding (CABAC) or context-adaptive variable length coding (CAVLC) as a lossless coding technique to form the final bitstream. CABAC offers better coding efficiency (around 10-12%) when compared to CAVLC, but at the expense of higher computational complexity.
[003] Conventionally, in any generic video coding standard which adopts CABAC to create a standard-compliant bitstream, there exists a three-stage CABAC decoding process. The first stage, context modeling, selects the probability model under which each individual bin of a binary string is decoded, where each probability model is represented by a pair of symbols (Mps, MpsState). The second stage, binary arithmetic decoding, decodes the value of a bin using the selected context model and then updates the context model. The third stage, symbol decoding, maps the decoded bins in the binary string of a particular coding symbol to the value of that symbol, such as a motion vector, macroblock type or residual coefficient.
[004] Further conventionally, the video coding standard also includes a three-stage CABAC encoding process to encode a syntax element SE, i.e., macroblock type, motion vectors or residual coefficients. The first stage, binarization, maps the syntax element SE to a unique binary string. The second stage, context modeling, selects the probability model under which each bin of the binary string is encoded, where each probability model is represented by a pair of symbols (Mps, MpsState) dependent on the past encoded symbol information. The third stage, binary arithmetic encoding, encodes the bin by using the selected context model and updates the context model.
[005] Typically, encoding or decoding each bin takes around 30 to 40 cycles on a normal DSP. In general, the ratio between the number of bins and the number of bits in a macroblock (MB) is 1.5:1. Based on the above points, the DSP needs to spend nearly 4500 cycles/MB for encoding/decoding a 4 Mbps CABAC stream of a 720x480 resolution video sequence at 30 frames/second.
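By way of illustration only, the cycle budget stated above can be reproduced with a short calculation. The sketch below is not part of the disclosed instruction set. It assumes 16x16 macroblocks, as in H.264, and the function name cycles_per_macroblock is hypothetical. The seven-cycle and six-cycle inputs are the per-bin figures stated in the abstract and cover only the bin decoding or encoding itself.

```python
# Figures quoted in paragraph [005].
BITRATE_BPS = 4_000_000          # 4 Mbps stream
WIDTH, HEIGHT, FPS = 720, 480, 30
BINS_PER_BIT = 1.5               # bins : bits ratio of 1.5:1
MB_SIZE = 16                     # ASSUMPTION: 16x16 macroblocks (H.264)


def cycles_per_macroblock(cycles_per_bin):
    """Estimate the CABAC cycles spent on one macroblock."""
    mbs_per_second = (WIDTH // MB_SIZE) * (HEIGHT // MB_SIZE) * FPS
    bits_per_mb = BITRATE_BPS / mbs_per_second
    return bits_per_mb * BINS_PER_BIT * cycles_per_bin


if __name__ == "__main__":
    # 30 and 40: conventional DSP; 7 and 6: per-bin counts from the abstract.
    for cycles in (30, 40, 7, 6):
        print(cycles, round(cycles_per_macroblock(cycles)))
```

With 30 cycles per bin the estimate is approximately 4444 cycles/MB, which is consistent with the figure of nearly 4500 cycles/MB given above.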
[006] In light of the aforementioned discussion, there exists a need to improve the CABAC encoding/decoding speed on any DSP by providing specialized instructions, and there is also a need for specialized algorithms to implement the basic bin encoding/decoding and context modeling.
Brief Summary of the Invention
[007] The following presents a simplified summary of the disclosure in order to provide a basic understanding to the reader. This summary is not an extensive overview of the disclosure and it does not identify key/critical elements of the invention or delineate the scope of the invention. Its sole purpose is to present some concepts disclosed herein in a simplified form as a prelude to the more detailed description that is presented later.
[008] A more complete appreciation of the present invention and the scope thereof can be obtained from the accompanying drawings which are briefly summarized below and the following detailed description of the presently preferred embodiments.
[009] Exemplary embodiments of the present invention disclose a context-adaptive binary arithmetic coding method for decoding bit streams in a very long instruction word (VLIW) architecture. According to a first aspect, the method includes resetting a context-adaptive binary arithmetic coding engine and updating a bit stream buffer address through a reset instruction. The plurality of input registers are assigned a reset flag and a bit stream pointer address for decoding a context-adaptive binary arithmetic coding bit stream.
[0010] According to the first aspect, the method includes decoding a bin through a specified lookup table instruction by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state by utilizing a transition table in a first cycle.
[0011] According to the first aspect, the method includes subtracting a plurality of first set of operands assigned to the plurality of input registers through a subtraction instruction in a second cycle. The normalization instruction executing in parallel to the subtraction instruction assigns a plurality of signed data bits to the plurality of input registers for analysing the normalized data and the number of left shifts executed by the signed data bits through a plurality of arithmetic and logical units.
[0012] According to the first aspect, the method includes subtracting a plurality of second set of operands assigned to input registers by updating a plurality of status flags through a subtraction with status update instruction in a third cycle. The normalization instruction in parallel to the subtraction with status update instruction assigns the plurality of signed data bits to the plurality of input registers for analysing the normalized data and the number of left shifts executed by the signed data bits through a plurality of arithmetic and logical units.
[0013] According to the first aspect, the method includes determining whether a value of a first operand associated with the plurality of input registers is greater than or equal to a value of a second operand associated with the plurality of input registers. If so, the values of the operands associated with the plurality of input registers are equalized; if the value of the first operand is less than the value of the second operand, no action is performed. The step of determining the value of the operands is performed in a fourth cycle and a fifth cycle.
[0014] According to the first aspect, the method includes extracting bits from a bit stream buffer and updating a bit stream pointer through a get bits from a bit stream buffer instruction in a sixth cycle by assigning an input register with a pointer to the bit stream buffer and by accessing the number of bits from the bit stream.
[0015] According to the first aspect, the method includes extracting a particular bit from the plurality of input registers to store on the least significant bit side of a register by assigning an input data and a constant value to the plurality of input registers. The constant value includes the position of the bit and the number of bits to be extracted at that specified position through a bit extraction instruction in a seventh cycle.
[0016] According to a second aspect, the method for encoding bit streams in a very long instruction word architecture includes resetting a context-adaptive binary arithmetic coding engine and updating a bit stream buffer address through a reset instruction by assigning a plurality of input registers with a reset flag and a bit stream pointer address to encode the context-adaptive binary arithmetic coding bit streams.
[0017] According to the second aspect, the method for encoding bit streams in a very long instruction word architecture includes encoding a bin through a specified lookup table instruction by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state by utilizing a transition table in a first cycle.
[0018] According to a second aspect, the method for encoding bit streams in a very long instruction word architecture includes subtracting a plurality of first set of operands assigned to input registers through a subtraction instruction in a second cycle. The bit extraction instruction executing in parallel to the subtraction instruction extracts a particular bit from the plurality of input register by assigning an input data and a constant value to the plurality of input registers including a position of the bit and a number of bits to be extracted at that specified position through a plurality of arithmetic and logical units.
[0019] According to a second aspect, the method for encoding bit streams in a very long instruction word architecture includes adding a plurality of second set of operands assigned to the plurality of input registers through an addition instruction in a third cycle. The specified comparison instruction used in parallel to the addition instruction compares the values assigned to a plurality of input registers and updates the plurality of status flags.
[0020] According to a second aspect, the method for encoding bit streams in a very long instruction word architecture includes determining whether a value of a first operand associated with a plurality of input registers is equal to a value of a second operand associated with a plurality of input registers, by considering the information updated in the plurality of status flags, for choosing the next instruction path to be performed in a fourth cycle and a fifth cycle. If the equal status flag indicates that the value of the first operand is equal to the value of the second operand, no action is performed.
[0021] According to a second aspect, the method for encoding bit streams in a very long instruction word architecture includes renormalizing a context-adaptive binary arithmetic coding for encoding a bin by assigning the plurality of input registers with a plurality of predefined global variables and transmit the generated bits to a bit stream buffer through a re-normalization instruction in a sixth cycle.
[0022] According to a third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of global variables assigned at the beginning of decoding each slice to improve a performance of an instruction set executed by a context-adaptive binary arithmetic coding decoder.
[0023] According to the third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set.
[0024] According to the third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a reset instruction configured to perform an operation of resetting a context adaptive binary arithmetic coding engine and updating a bit stream buffer address by assigning the plurality of input registers with a reset flag and a bit stream pointer address to decode a context- adaptive binary arithmetic coding bit stream.
[0025] According to the third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a lookup table instruction configured to decode a bin by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state using a transition table.
[0026] According to the third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a subtraction with status update instruction configured to perform an operation of subtraction among a plurality of first set of operands assigned to the plurality of input registers and to update the plurality of status flags.
[0027] According to the third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a normalization instruction configured to count the number of signed bits included in a source register and left-shifts the value of the source register to store the normalized value and number of sign bits in destination registers.
[0028] According to the third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a bit extracting instruction for extracting a particular bit from the plurality of input registers configured to perform a respective operation by considering the position of the bit and the number of bits to be extracted at that specified position.
[0029] According to the third aspect, the system for decoding a bit stream in a very long instruction word architecture includes a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a get bits from the bit stream buffer instruction configured to extract bits from a bit stream buffer and update a bit stream pointer by assigning the plurality of input registers with a pointer to the bit stream buffer and access the number of bits from the bit stream.
[0030] According to a fourth aspect, the system for encoding a bit stream in a very long instruction word architecture includes a plurality of global variables assigned at the beginning of encoding each slice to improve a performance of an instruction set executed by a context-adaptive binary arithmetic coding encoder.
[0031] According to the fourth aspect, the system for encoding a bit stream in a very long instruction word architecture includes a plurality of parameters configured in a context model of at least one symbol bin are combined into a single parameter to initialize at the beginning of encoding each slice for performing the operation specified by a predetermined instruction set.
[0032] According to the fourth aspect, the system for encoding a bit stream in a very long instruction word architecture includes a plurality of parameters configured in a context model of at least one symbol bin are combined into a single parameter to initialize at the beginning of encoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a reset instruction configured to perform an operation of resetting a context-adaptive binary arithmetic coding engine and updating a bit stream buffer address by assigning the plurality of input registers with a reset flag and a bit stream pointer address to encode a context-adaptive binary arithmetic coding bit stream.
[0033] According to the fourth aspect, the system for encoding a bit stream in a very long instruction word architecture includes a plurality of parameters configured in a context model of at least one symbol bin are combined into a single parameter to initialize at the beginning of encoding each slice for performing the operation specified by a predetermined instruction set.
The predefined instruction set comprising a lookup table instruction configured to encode a bin by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state by utilizing a transition table.
[0034] According to the fourth aspect, the system for encoding a bit stream in a very long instruction word architecture includes a plurality of parameters configured in a context model of at least one symbol bin are combined into a single parameter to initialize at the beginning of encoding each slice for performing the operation specified by a predetermined instruction set. The predefined instruction set comprising a renormalizing instruction configured to perform an operation of renormalizing a context-adaptive binary arithmetic coding encoder for encoding a bin by assigning the plurality of input registers with a plurality of predefined global variables and to transmit the generated bits to a bit stream buffer during a re-normalization process.
Brief Description of the Drawings
[0035] The above-mentioned and other features and advantages of this present disclosure, and the manner of attaining them, will become more apparent and the present disclosure will be better understood by reference to the following description of embodiments of the present disclosure taken in conjunction with the accompanying drawings, wherein:
[0036] FIG. 1 is a block diagram depicting a very long instruction word (VLIW) architecture.
[0037] FIG. 2 is a flow diagram depicting a method employed for decoding a bin using an arithmetic decoding process.
[0038] FIG. 3 is a flow diagram depicting a method employed for decoding a bin in a minimum number of clock cycles using an arithmetic decoding process.
[0039] FIG. 4 is a flow diagram depicting a method employed for encoding a bin using an arithmetic encoding process.
[0040] FIG. 5 is a flow diagram depicting a method employed for encoding a bin in a minimum number of clock cycles using an arithmetic encoding process.
Detailed Description of the Invention
[0041] It is to be understood that the present disclosure is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The present disclosure is capable of other embodiments and of being practised or of being carried out in various ways. Also, it is to be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting.
[0042] The use of "including", "comprising" or "having" and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. The terms "a" and "an" herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced item. Further, the use of terms "first", "second", and "third", and the like, herein do not denote any order, quantity, or importance, but rather are used to distinguish one element from another.
[0043] FIG. 1 is a block diagram 100 depicting a very long instruction word (VLIW) architecture. According to a non limiting exemplary embodiment of the present invention, the system includes a program memory 102 connected to a processor 104, which includes a program counter 106, a register set 108, an arithmetic and logical unit-zero 110a and an arithmetic and logical unit-one 110b. Further, a bit stream buffer 116, a data memory 118, a range table least probable symbol read only memory 112 and a transition table read only memory 114 are in communication with the processor 104.
[0044] In accordance with a non limiting exemplary embodiment of the present invention, the program memory 102 is used to initially program the instruction sets to be executed by the processor 104. The program counter 106 of the processor 104 is used to locate the start of execution of an instruction and also to advance the cycle count from cycle 0 to cycle 7 for decoding and from cycle 0 to cycle 6 for encoding. The register set 108 included in the processor 104 provides the multiple registers that hold the operands and results of the instruction set, and the arithmetic and logical unit-zero 110a and the arithmetic and logical unit-one 110b included in the processor 104 are used to execute the instructions in parallel.
[0045] According to a non limiting exemplary embodiment of the present invention, the range table least probable symbol read only memory 112 in communication with the processor 104 is a constant memory that holds the least probable symbol range table, which is accessed according to the specified instructions executed in the arithmetic and logical unit zero 110a or the arithmetic and logical unit one 110b. For example the range table format can be in the form of:
[0046] Similarly, the transition table read only memory 114, which is also in communication with the processor 104, is a constant memory that holds the transition table, which is accessed by specified instructions such as the look up table instruction for decoding and encoding the bit stream, executed in the arithmetic and logical unit zero 110a or the arithmetic and logical unit one 110b. For example the transition table format can be in the form of:
TransTable[128] = 0xc001, 0x8002, 0x8103, 0x8204, 0x8205, 0x8406, 0x8407, 0x8508, 0x8609, 0x870a, 0x880b, 0x890c 0x63f6, 0x63f7, 0x64f8, 0x64f9, 0x64fa, 0x65fb, 0x65fc, 0x65fd, 0x66fe, 0x66fe, 0x7fff.
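The disclosure does not spell out the field layout of a transition table entry. The following sketch shows one possible reading that is consistent with the listed values, with the first entry read as 0xc001. It is an assumption and not a statement of the actual format: the high byte is taken as the modified least probable symbol state, the low byte as the modified most probable symbol state, and within each byte bit 7 is taken as the bin value, bit 6 as the most probable symbol and bits 5 to 0 as the probability state. The names unpack_state and unpack_entry are hypothetical.

```python
# A few entries copied from the TransTable listing (index -> 16-bit entry).
SAMPLE_ENTRIES = {0: 0xC001, 1: 0x8002, 2: 0x8103, 3: 0x8204, 127: 0x7FFF}


def unpack_state(byte):
    """Split one byte into (bin, mps, state).

    ASSUMED layout: bit 7 = bin value, bit 6 = most probable symbol,
    bits 5..0 = probability state index.
    """
    return (byte >> 7) & 1, (byte >> 6) & 1, byte & 0x3F


def unpack_entry(entry):
    """Return (lps_path, mps_path) for a 16-bit table entry.

    ASSUMPTION: high byte = modified LPS state, low byte = modified MPS state.
    """
    return unpack_state(entry >> 8), unpack_state(entry & 0xFF)


if __name__ == "__main__":
    for index, entry in SAMPLE_ENTRIES.items():
        print(index, hex(entry), unpack_entry(entry))
```

Under this reading, entry 2 (0x8103) moves to state 3 when the most probable symbol is decoded and to state 1 when the least probable symbol is decoded.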
[0047] In accordance with a non limiting exemplary embodiment of the present invention, the system further includes a bit stream buffer 116 communicating with the processor 104, which supplies the bit stream data processed by the arithmetic and logical unit-one 110b for mapping the data into meaningful symbols in the case of a decoder. Similarly, in the case of an encoder, the symbols are encoded into bits by executing multiple instructions. Further, the data memory 118 in communication with the processor 104 supports the instructions programmed in the program memory 102 by holding the available local variables.
The symbols formed by decoding the bit stream data from the bit stream buffer 116 are transmitted to the data memory 118; conversely, the symbol data available in the data memory 118 is transmitted to the bit stream buffer 116 after being encoded.
[0048] FIG. 2 is a flow diagram 200 depicting a method employed for decoding a bin using an arithmetic decoding process. According to a non limiting exemplary embodiment of the present invention, the method of decoding a bin starts at step 202 by setting the range, offset and most probable symbol state values. At step 204 the values of a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state are calculated through a look up table instruction with the predetermined range and most probable symbol state values as inputs. Further, at step 206 the range value is calculated by subtracting the range of the least probable symbol from the initially set range value through a subtraction instruction.
[0049] In accordance with a non limiting exemplary embodiment of the present invention, next at step 208 the subtraction with status update instruction subtracts the range value from the offset value, assigns the result to "off" and tests whether the "off" value is greater than or equal to zero. If the "off" value is found to be greater than or equal to zero in step 208, then at step 210 the range of the least probable symbol is assigned to the range, the "off" value is assigned to the offset and the modified least probable symbol state is assigned to the modified most probable symbol state, and the method continues with step 212. If the "off" value is not greater than or equal to zero in step 208, the method continues directly with step 212, which normalizes the range value using the normalization instruction and assigns the normalized value to the range output register and the shift count to the bits output register. Along with the normalization instruction, a bit extraction instruction is also executed in parallel at step 212; it takes the modified most probable symbol state as input and extracts a particular bit from the input register, which is assigned to the bin.
[0050] According to a non limiting exemplary embodiment of the present invention, next at step 214 the offset value is left-shifted by the number of bits obtained from the normalization (offset <<= bits), and the get bits from the bit stream buffer instruction is used to extract bits from a bit stream buffer by assigning the input registers with a stream pointer to the bit stream buffer and the number of bits to be accessed from the bit stream.
The bits extracted from the bit stream buffer are allocated to a string. Further, at step 216 the offset value is incremented by the string value obtained in step 214, and after decoding the bin the modified most probable symbol state is assigned to the most probable symbol state.
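Steps 202 to 216 of FIG. 2 can be sketched in software as follows. This is an illustrative model under stated assumptions, not the disclosed hardware: the range is assumed to be normalized to a 9-bit window as in H.264 CABAC, the bin is assumed to sit at bit 7 of the modified state (the parameter bin_pos is hypothetical), and toy_lookup and toy_get_bits are placeholder stand-ins for the range table, the transition table and the bit stream buffer, whose contents are not reproduced here.

```python
def normalize(rng, width=9):
    """Left-shift rng until bit (width - 1) is set; return (rng, shifts).

    ASSUMPTION: a 9-bit range window, as in H.264 CABAC.
    """
    if rng <= 0:
        raise ValueError("range must be positive")
    shifts = 0
    while rng < (1 << (width - 1)):
        rng <<= 1
        shifts += 1
    return rng, shifts


def decode_bin(rng, offset, mps_state, lookup, get_bits, bin_pos=7):
    """One pass through steps 204-216; returns (range, offset, state, bin)."""
    # Step 204: look up rLPS and both candidate next states.
    rlps, m_mps_state, m_lps_state = lookup(rng, mps_state)
    # Step 206: range of the most probable symbol.
    rng -= rlps
    # Step 208: subtraction with status update.
    off = offset - rng
    if off >= 0:
        # Step 210: least probable symbol path.
        rng, offset, m_mps_state = rlps, off, m_lps_state
    # Step 212: normalization and bit extraction (parallel in hardware).
    rng, bits = normalize(rng)
    bin_val = (m_mps_state >> bin_pos) & 1
    # Step 214: shift the offset and fetch the same number of new bits.
    offset <<= bits
    string = get_bits(bits)
    # Step 216: merge the new bits; the modified state becomes the state.
    offset += string
    return rng, offset, m_mps_state, bin_val


# Hypothetical stand-ins so that the sketch runs on its own.
def toy_lookup(rng, mps_state):
    return 100, 0x01, 0xC0      # (rLPS, modified MPS state, modified LPS state)


def toy_get_bits(nbits):
    return (1 << nbits) - 1     # pretend the stream is all ones


if __name__ == "__main__":
    print(decode_bin(400, 100, 0x00, toy_lookup, toy_get_bits))
    print(decode_bin(400, 350, 0x00, toy_lookup, toy_get_bits))
```

The lookup and bit source are passed in as callables so that the control flow of FIG. 2 stays separate from the table contents.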
[0051] FIG. 3 is a flow diagram 300 depicting a method employed for decoding a bin in a minimum number of clock cycles using an arithmetic decoding process. According to a non limiting exemplary embodiment of the present invention, the method of decoding a bin in a minimum number of clock cycles starts at step 302 by assigning the predetermined range value to register R0, the predetermined offset value to register R1 and the predetermined most probable symbol state value to register R2. At step 304 the look up table instruction is executed in a first cycle to calculate the range of least probable symbol value assigned to R3, the modified most probable symbol state value assigned to R4 and the modified least probable symbol state value assigned to R5, using the input registers R2 (most probable symbol state) and R0 (range) through a transition table.
[0052] In accordance with a non limiting exemplary embodiment of the present invention, at step 306 the subtraction instruction and the normalization instruction are executed in parallel and complete in a second cycle, because the processor has two arithmetic and logical units.
The subtraction instruction is executed on the first set of operands assigned to the input registers R0 and R3 and allocates the subtracted value to the output register R0. Similarly, at step 306 the normalization instruction is executed in parallel to the subtraction instruction on the input register R3 to obtain the normalized data in R3 and the number of left shifts in R6. Next, at step 308 the subtraction with status update instruction and the normalization instruction are executed in parallel and complete in a third cycle, because the processor has two arithmetic and logical units.
The subtraction with status update instruction is executed on the second set of operands assigned to the input registers R1 and R0, writes the result of the subtraction to the output register R7 and updates the multiple status flags. Similarly, at step 308 the normalization instruction is executed in parallel to the subtraction with status update instruction on the input register R0 to obtain the normalized data in R0 and the number of left shifts in R8.
[0053] According to a non limiting exemplary embodiment of the present invention, at step 310 a status flag (the greater than or equal flag), which is updated in step 308 by the subtraction with status update instruction, is checked to see whether it is ON or OFF. If the status flag is found to be ON, the register values are copied at step 312 such that R0 is set to R3, R1 is set to R7, R4 is set to R5 and R8 is set to R6, and the method continues with step 316. If the status flag is not ON, no operation is performed at step 314 and the method continues with step 316. The condition handling of steps 310, 312 and 314 is completed in two cycles, cycle 4 and cycle 5.
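The register traffic of cycles 2 to 5 can be modelled as below. The sketch is illustrative only. It assumes that the two instructions issued in one cycle both read their source registers before either writes its result, so that the subtraction with status update in cycle 3 sees the un-normalized R0, and it assumes a 9-bit normalization window as in H.264 CABAC. The function names are hypothetical. The point shown is that both candidate ranges are normalized before the comparison result is used, so the selection at step 312 reduces to register copies.

```python
def norm(value, width=9):
    """Model of the normalization instruction: (normalized value, shifts).

    ASSUMPTION: normalizes into a 9-bit window rather than a full register.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    shifts = 0
    while value < (1 << (width - 1)):
        value <<= 1
        shifts += 1
    return value, shifts


def cycles_2_to_5(r0, r1, r3, r4, r5):
    """r0 = range, r1 = offset, r3 = rLPS, r4/r5 = modified MPS/LPS state.

    Returns (R0, R1, R4, R8) as they stand on entry to step 316.
    """
    # Cycle 2 (step 306): SUB on one ALU, NORM of rLPS on the other.
    r0 = r0 - r3
    r3, r6 = norm(r3)
    # Cycle 3 (step 308): SUBS reads R0 before NORM overwrites it (assumed).
    r7 = r1 - r0
    ge_flag = r7 >= 0
    r0, r8 = norm(r0)
    # Cycles 4-5 (steps 310-314): copy the LPS candidates if the flag is ON.
    if ge_flag:
        r0, r1, r4, r8 = r3, r7, r5, r6
    return r0, r1, r4, r8


if __name__ == "__main__":
    print(cycles_2_to_5(400, 350, 100, 0x01, 0xC0))  # flag ON
    print(cycles_2_to_5(400, 100, 100, 0x01, 0xC0))  # flag OFF
```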
[0054] In accordance with a non limiting exemplary embodiment of the present invention, at step 316 the offset register R1 is assigned the left-shifted value of the register R1, where the value of register R1 is left-shifted by the number of bits given by the value of the register R8. In the same cycle 6, the get bits from bit stream buffer instruction is executed to extract bits from the bit stream buffer into an output register R9 and to update the bit stream pointer in the output register R10, by assigning the input register R10 with a pointer to the bit stream buffer and the input register R8 with the number of bits to be accessed from the bit stream.
Next, at step 318 the value of register R9 is left-shifted by six bits and added to the value of register R1, and the sum is allocated to the register R1. A bit extraction instruction is also executed to extract a particular bit from the input register R4 and store it on the least significant bit side of a register R11.
The constant value provided in the opcode of the bit extraction instruction includes the position of the bit and the number of bits to be extracted at that specified position; step 318 is performed in a seventh cycle. Further, at step 320 the value of the register R0 is assigned to the range, the value of the register R1 to the offset, the value of the register R4 to the most probable symbol state and the value of the register R11 to the bin.
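The behaviour of the get bits from bit stream buffer instruction and of the bit extraction instruction can be sketched as two small functions. This is a software model under assumptions that the disclosure does not confirm: the bit stream pointer is taken to be a bit index, bits are taken to be read most significant bit first, and the function names get_bits and extract_bits are hypothetical.

```python
def get_bits(buffer, bit_ptr, nbits):
    """Read nbits from buffer starting at bit index bit_ptr.

    ASSUMPTIONS: MSB-first bit order and a bit-granular pointer.
    Returns (value, updated bit pointer), mirroring outputs R9 and R10.
    """
    value = 0
    for i in range(bit_ptr, bit_ptr + nbits):
        byte = buffer[i >> 3]
        value = (value << 1) | ((byte >> (7 - (i & 7))) & 1)
    return value, bit_ptr + nbits


def extract_bits(value, pos, nbits=1):
    """Return nbits of value starting at bit position pos, right-aligned.

    pos and nbits play the role of the constant carried in the opcode.
    """
    return (value >> pos) & ((1 << nbits) - 1)


if __name__ == "__main__":
    stream = bytes([0b10110010, 0b01000000])
    print(get_bits(stream, 0, 3))     # first three bits
    print(get_bits(stream, 6, 4))     # read across the byte boundary
    print(extract_bits(0xC0, 7))      # single bit at position 7
```

Returning the updated pointer together with the extracted value mirrors the two output registers described for step 316.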
[0055] FIG. 4 is a flow diagram 400 depicting a method employed for encoding a bin using an arithmetic encoding process. According to a non-limiting exemplary embodiment of the present invention, the method of encoding a bin starts at step 402 by setting the range value, the low value and the most probable symbol state value.
At step 404, the range of the least probable symbol, a modified most probable symbol state and a modified least probable symbol state are calculated through a look up table instruction that takes the predetermined range value and the most probable symbol state value as inputs. Further, at step 406, the range value is calculated by subtracting the range of the least probable symbol from the initially set range value through a subtraction instruction, and the most probable symbol value is calculated by right-shifting the most probable symbol state value by six bits and then taking the least significant bit.
[0056] In accordance with a non-limiting exemplary embodiment of the present invention, at step 408 a condition checks whether the value of the bin differs from the value of the most probable symbol. If the bin value is not equal to the most probable symbol value, then at step 410 the sum of the low value and the range value is assigned to the low, the range of the least probable symbol is assigned to the range and the modified least probable symbol state is assigned to the modified most probable symbol state, and the method continues with step 412. If the bin value is equal to the most probable symbol value, the method continues directly with step 412. At step 412 the re-normalization instruction is executed: the input registers are assigned the predefined global variables comprising the range value and the low value, the generated bits are transmitted to a bit stream buffer, and the updated range and low values are stored in the output range and low registers. Further, at step 414, after the bin has been encoded, the modified most probable symbol state value is assigned to the most probable symbol state.
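The flow of FIG. 4 up to the re-normalization step can be sketched as follows. This is an illustrative sketch only: `encode_bin_update` and `toy_lookup` are hypothetical names, `toy_lookup` is a placeholder and not the actual transition table, and the re-normalization of step 412 is not modelled because its internals are not described here.

```python
from typing import Callable, Tuple

# Hypothetical shape of the look up table instruction of step 404:
# (range, mps_state) -> (range_lps, modified_mps_state, modified_lps_state)
Lookup = Callable[[int, int], Tuple[int, int, int]]


def toy_lookup(rng: int, mps_state: int) -> Tuple[int, int, int]:
    """Placeholder table, NOT the real transition table; it only exercises the flow."""
    return rng >> 2, mps_state + 1, mps_state - 1


def encode_bin_update(rng: int, low: int, mps_state: int, bin_value: int,
                      lookup: Lookup) -> Tuple[int, int, int]:
    """Steps 404 to 410 and 414 of FIG. 4, without the re-normalization of step 412."""
    range_lps, mod_mps_state, mod_lps_state = lookup(rng, mps_state)  # step 404
    rng -= range_lps                    # step 406: range = range - range of LPS
    mps = (mps_state >> 6) & 1          # step 406: MPS is bit 6 of the state value
    if bin_value != mps:                # step 408
        low += rng                      # step 410: low = low + range
        rng = range_lps                 # step 410: range = range of LPS
        mod_mps_state = mod_lps_state   # step 410: take the LPS transition
    # Step 412 (re-normalization of range and low) would run here.
    return rng, low, mod_mps_state      # step 414: new most probable symbol state
```

With the placeholder table, a range of 400, a low of 10 and a state value of 0x45 (most probable symbol 1), encoding bin 1 leaves low unchanged and reduces the range to 300, while encoding bin 0 moves low to 310 and sets the range to 100.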
[0057] FIG. 5 is a flow diagram 500 depicting a method employed for encoding a bin in a minimum number of clock cycles using an arithmetic encoding process. According to a non-limiting exemplary embodiment of the present invention, the method starts at step 502 by assigning the predetermined range value to register R0, the predetermined low value to register R1, the predetermined most probable symbol state value to register R2 and the predetermined bin value to register R3. At step 504 the look up table instruction is executed in the first cycle: using the input registers R2 (most probable symbol state) and R0 (range), it calculates, through a transition table, the range of the least probable symbol, which is assigned to R5, the modified most probable symbol state, which is assigned to R6, and the modified least probable symbol state, which is assigned to R7.
[0058] In accordance with a non-limiting exemplary embodiment of the present invention, at step 506 the subtraction instruction and the bit extraction instruction are executed in parallel in the second cycle, which is possible because the processor has two arithmetic and logical units.
The subtraction instruction takes the first set of operands from the input registers R0 and R5 and stores the difference in the output register R0. In parallel, the bit extraction instruction extracts a particular bit from the input data held in register R2; a constant value specifies the position of the bit and the number of bits to be extracted at that position, and the extracted bits are stored in register R4.
Next, at step 508, the addition instruction and the comparison instruction are executed in parallel in the third cycle, again using the two arithmetic and logical units of the processor.
The addition instruction adds the second set of operands held in the input registers R1 and R0 and stores the sum in register R4. In parallel, the comparison instruction compares the values held in the input registers R3 and R4 and updates the status flags.
[0059] According to a non-limiting exemplary embodiment of the present invention, at step 510 a condition checks whether the input registers R4 and R3 supplied to the comparison instruction at step 508 were found to be equal, based on the status flag updated at step 508. If the values of R4 and R3 were found to be not equal, then at step 512 the value of register R5 is assigned to register R0, the value of register R4 is assigned to register R1 and the value of register R7 is assigned to register R6, and the method continues with step 516.
If the values of R4 and R3 were found to be equal, the method performs no operation at step 514 and continues with step 516. The conditional handling of steps 510, 512 and 514 is completed in two cycles, cycle 4 and cycle 5.
[0060] In accordance with a non-limiting exemplary embodiment of the present invention, at step 516 a re-normalization instruction is executed in cycle 6: the input registers R0 and R1 hold the predefined global variables, namely the range value and the low value, the updated range value and low value are written back to registers R0 and R1, and the bits generated during the re-normalization process are transmitted to a bit stream buffer. Next, at step 518, after the bin has been encoded in the minimum number of clock cycles, the value of register R0 is assigned to the range, the value of register R1 to the low and the value of register R6 to the most probable symbol state.
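The six-cycle schedule of FIG. 5 can be modelled cycle by cycle with a small register file. This is a sketch under stated assumptions: `encode_bin_registers`, `toy_lookup` and `no_renorm` are hypothetical names, the lookup is a placeholder rather than the real transition table, and the re-normalization is passed in as a stand-in callable. The sketch also assumes that instructions issued in the same cycle read their source registers before any result is written, so that the comparison in cycle 3 sees the most probable symbol placed in R4 during cycle 2 rather than the sum written to R4 by the parallel addition.

```python
from typing import Callable, Tuple


def toy_lookup(rng: int, mps_state: int) -> Tuple[int, int, int]:
    """Placeholder table, NOT the real transition table."""
    return rng >> 2, mps_state + 1, mps_state - 1


def no_renorm(rng: int, low: int) -> Tuple[int, int]:
    """Stand-in for the re-normalization instruction; leaves range and low unchanged."""
    return rng, low


def encode_bin_registers(rng: int, low: int, mps_state: int, bin_value: int,
                         lookup: Callable[[int, int], Tuple[int, int, int]],
                         renormalize: Callable[[int, int], Tuple[int, int]]
                         ) -> Tuple[int, int, int]:
    # Step 502: load range, low, MPS state and bin into R0..R3.
    r = {0: rng, 1: low, 2: mps_state, 3: bin_value}
    # Cycle 1 (step 504): look up table instruction writes R5, R6, R7.
    r[5], r[6], r[7] = lookup(r[0], r[2])
    # Cycle 2 (step 506): subtraction and bit extraction issue in parallel.
    new_r0 = r[0] - r[5]            # R0 = R0 - R5
    new_r4 = (r[2] >> 6) & 1        # R4 = bit 6 of R2 (most probable symbol)
    r[0], r[4] = new_r0, new_r4
    # Cycle 3 (step 508): addition and comparison issue in parallel.
    equal_flag = r[3] == r[4]       # compare reads the old R4 (assumed read-before-write)
    r[4] = r[1] + r[0]              # R4 = R1 + R0
    # Cycles 4 and 5 (steps 510 to 514): conditional register moves.
    if not equal_flag:
        r[0], r[1], r[6] = r[5], r[4], r[7]
    # Cycle 6 (step 516): re-normalization updates R0 and R1.
    r[0], r[1] = renormalize(r[0], r[1])
    # Step 518: range, low and most probable symbol state.
    return r[0], r[1], r[6]
```

Reusing R4 for both the extracted symbol and the sum keeps the register count low, but it only works if the read-before-write assumption above holds for the target processor.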
[0061] While specific embodiments of the invention have been shown and described in detail to illustrate the inventive principles, it will be understood that the invention may be embodied otherwise without departing from such principles.
5. Claims:
What is claimed is:
1. A context-adaptive binary arithmetic coding method for decoding bit streams in a very long instruction word architecture (VLIW), the method comprising:
resetting a context-adaptive binary arithmetic coding engine and updating a bit stream buffer address through a reset instruction, whereby a plurality of input registers assigned with a reset flag and a bit stream pointer address for decoding a context-adaptive binary arithmetic coding bit streams;
decoding a bin through a specified lookup table instruction by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state by utilizing a transition table in a first cycle;
subtracting a plurality of first set of operands assigned to the plurality of input registers through a subtraction instruction in a second cycle, whereby a normalization instruction executing in parallel to the subtraction instruction assigns a plurality of signed data bits to the plurality of input registers for analysing the normalized data and the number of left shifts executed by the signed data bits through a plurality of arithmetic and logical units;
subtracting a plurality of second set of operands assigned to input registers by updating a plurality of status flags through a subtraction with status update instruction in a third cycle, whereby a normalization instruction in parallel to the subtraction with status update instruction assigns the plurality of signed data bits to the plurality of input registers for analysing the normalized data and the number of left shifts executed by the signed data bits through a plurality of arithmetic and logical units;
determining if a value of a first operand associated with the plurality of input registers is greater than or equal to a value of a second operand associated with a plurality of input registers for equalizing the values of operands associated with the plurality of input registers and if the value of the first operand is less than the value of the second operand to not perform an action, whereby the step of determining the value of operands performed in a fourth cycle and a fifth cycle;
extracting bits from a bit stream buffer and updating a bit stream pointer through a get bits from a bit stream buffer instruction in a sixth cycle by assigning an input register with a pointer to the bit stream buffer and by accessing the number of bits from the bit stream; and
extracting a particular bit from the plurality of input registers to store on the least significant bit side of a register by assigning an input data and a constant value to the plurality of input registers, whereby the constant value includes the position of the bit and the number of bits to be extracted at that specified position through a bit extraction instruction in a seventh cycle.
2. The method of claim 1 further comprises a step of decoding a context-adaptive binary arithmetic coding bin in a minimum number of clock cycles.
3. The method of claim 1 further comprises a step of configuring a plurality of arithmetic and logical units for decoding a bit stream by parallel execution of the instructions.
4. A context-adaptive binary arithmetic coding method for encoding bit streams in a very long instruction word architecture, the method comprising:
resetting a context-adaptive binary arithmetic coding engine and updating a bit stream buffer address through a reset instruction by assigning a plurality of input registers with a reset flag and a bit stream pointer address to encode the context-adaptive binary arithmetic coding bit streams;
encoding a bin through a specified lookup table instruction by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state by utilizing a transition table in a first cycle;
subtracting a plurality of first set of operands assigned to input registers through a subtraction instruction in a second cycle, whereby a bit extraction instruction executing in parallel to the subtraction instruction extracts a particular bit from the plurality of input register by assigning an input data and a constant value to the plurality of input registers including a position of the bit and a number of bits to be extracted at that specified position through a plurality of arithmetic and logical units;
adding a plurality of second set of operands assigned to the plurality of input registers through an addition instruction in a third cycle, whereby a specified comparison instruction used in parallel to the addition instruction compares the values assigned to a plurality of input registers and updates the plurality of status flags;
determining a value of a first operand associated with a plurality of input registers is equal or not with a value of a second operand associated with a plurality of input registers by considering the information updated in the plurality of status flags for choosing the next instruction path to be performed in fourth cycle and fifth cycle, whereby if the value of the first operand associated with the plurality of input registers is equal to the value of the second operand associated with the plurality of input registers to not perform an action by utilizing an equal status flag; and
renormalizing a context-adaptive binary arithmetic coding for encoding a bin by assigning the plurality of input registers with a plurality of predefined global variables and transmit the generated bits to a bit stream buffer through a re-normalization instruction in a sixth cycle.
5. The method of claim 4 further comprises a step of encoding a context-adaptive binary arithmetic coding bin in a minimum number of clock cycles.
6. The method of claim 4 further comprises a step of configuring the plurality of arithmetic and logical units for encoding a bit stream by parallel execution of the instructions.
7. A context-adaptive binary arithmetic coding decoder instructions for decoding a bit stream in a very long instruction word architecture comprising:
a plurality of global variables assigned at the beginning of decoding each slice to improve a performance of an instruction set executed by a context-adaptive binary arithmetic coding decoder;
a plurality of parameters associated to a context model comprising of at least one symbol bin combined into a single parameter to initialize at the beginning of decoding each slice for performing the operation specified by a predetermined instruction set, whereby the predefined instruction set comprising:
a reset instruction configured to perform an operation of resetting a context-adaptive binary arithmetic coding engine and updating a bit stream buffer address by assigning the plurality of input registers with a reset flag and a bit stream pointer address to decode a context-adaptive binary arithmetic coding bit stream;
a lookup table instruction configured to decode a bin by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state using a transition table;
a subtraction with status update instruction configured to perform an operation of subtraction among a plurality of first set of operands assigned to the plurality of input registers and to update the plurality of status flags;
a normalization instruction configured to count the number of signed bits included in a source register and left-shifts the value of the source register to store the normalized value and number of sign bits in destination registers;
a bit extracting instruction for extracting a particular bit from the plurality of input registers configured to perform a respective operation by considering the position of the bit and the number of bits to be extracted at that specified position; and
a get bits from the bit stream buffer instruction configured to extract bits from a bit stream buffer and update a bit stream pointer by assigning the plurality of input registers with a pointer to the bit stream buffer and access the number of bits from the bit stream.
8. The context-adaptive binary arithmetic coding decoder instructions of claim 7, wherein a plurality of parallel execution units configured to support at least one general instruction and an at least one context-adaptive binary arithmetic coding decoder specific instruction to decode a bit stream.
9. The context-adaptive binary arithmetic coding decoder instruction of claim 7, wherein a plurality of global variables assigned at the beginning of each slice includes a predetermined range value and a predetermined offset value.
10. The context-adaptive binary arithmetic coding decoder instruction of claim 7, wherein a plurality of parameters included in a context model comprising a most probable symbol and a most probable symbol state.
11. A context-adaptive binary arithmetic coding encoder instruction for encoding a bit stream in a very long instruction word architecture comprising:
a plurality of global variables assigned at the beginning of encoding each slice to improve a performance of an instruction set executed by a context-adaptive binary arithmetic coding encoder;
a plurality of parameters configured in a context model of at least one symbol bin are combined into a single parameter to initialize at the beginning of encoding each slice for performing the operation specified by a predetermined instruction set, whereby the predefined instruction set comprising:
a reset instruction configured to perform an operation of resetting a context-adaptive binary arithmetic coding engine and updating a bit stream buffer address by assigning the plurality of input registers with a reset flag and a bit stream pointer address to encode a context-adaptive binary arithmetic coding bit stream;
a lookup table instruction configured to encode a bin by assigning the plurality of input registers with a plurality of predefined context model parameters and a predefined global variable for finding a range of a least probable symbol, a modified most probable symbol state and a modified least probable symbol state by utilizing a transition table; and
a renormalizing instruction configured to perform an operation of renormalizing a context-adaptive binary arithmetic coding encoder for encoding a bin by assigning the plurality of input registers with a predefined plurality of global variables and transmit the generated bits to a bit stream buffer during a re-normalization process.
12. The context-adaptive binary arithmetic coding encoder instruction of claim 11, wherein a plurality of parallel execution units configured to support an at least one general instruction and an at least one context-adaptive binary arithmetic coding encoder specific instruction to encode a bit stream.
| # | Name | Date |
|---|---|---|
| 1 | 3669-CHE-2012 FORM-5 05-09-2012.pdf | 2012-09-05 |
| 2 | 3669-CHE-2012 FORM-3 05-09-2012.pdf | 2012-09-05 |
| 3 | 3669-CHE-2012 FORM-2 05-09-2012.pdf | 2012-09-05 |
| 4 | 3669-CHE-2012 FORM-1 05-09-2012.pdf | 2012-09-05 |
| 5 | 3669-CHE-2012 DRAWINGS 05-09-2012.pdf | 2012-09-05 |
| 6 | 3669-CHE-2012 DESCRIPTION (COMPLETE) 05-09-2012.pdf | 2012-09-05 |
| 7 | 3669-CHE-2012 CORRESPONDENCE OTHERS 05-09-2012.pdf | 2012-09-05 |
| 8 | 3669-CHE-2012 CLAIMS 05-09-2012.pdf | 2012-09-05 |
| 9 | 3669-CHE-2012 ABSTRACT 05-09-2012.pdf | 2012-09-05 |