Abstract: A method of speeding execution of computer programs is disclosed. The method, in accordance with principles of the present invention, involves chronometric mapping of subject code for specific program elements and use of the data therefrom to initiate suitable strategies for enabling ahead-of-time execution of encoded modules.
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
PROVISIONAL SPECIFICATION
(See Section 10 and Rule 13)
Title of invention: METHOD FOR SPEEDING EXECUTION OF COMPUTER PROGRAMS
Applicant: KPIT Cummins Infosystems Limited
35 and 36, Rajiv Gandhi Infotech Park
Phase 1, MIDC, Hinjawadi
Pune-411 057
Maharashtra, India
The following specification describes the invention
FIELD OF THE INVENTION
This invention relates generally to the field of computer programs and, more particularly, to methods and systems for non-sequential execution of program functions originally coded for sequential execution.
BACKGROUND OF THE INVENTION
The past few decades have been witness to the development of computer systems and their adaptive utility in various domains. The rapid pace of advances in software development and, more particularly, in the coding schematics used for writing such computer programs has necessitated an increase in the processing capacities of the hardware used for execution of these computer programs. There have been significant efforts on two fronts towards achieving these objectives: one in the development of faster as well as task-specific processors, and the other in the re-architecture of computer code for faster execution on available processors.
Parallel computing as a method to enable faster processing of computer programs has been the focus of recent innovations in the art. This has paved the way for the use of multiple processors and, of late, processors with multiple processing elements. In this context, the concepts of processor clusters and grids, which can use the processing capacities of local or remote computers for performing a task, deserve mention, as do multi-processor computers, which have multiple processing elements within the same machine. Though the concept of parallel computing seems advantageous and has seen a rise in acceptability and popularity amongst software developers, a drawback preventing this concept from becoming the mainstay of the field is that existing methods of programming are sequential and thus not directly suited to parallelizing the code. Rewriting such program code for parallel processing is generally tedious owing to the large size of the code, the expert manual effort involved in re-writing it, the high cost of resources including time, the emergence of new software bugs followed by rework and software testing, and the problems of communication and synchronization at the task level. In addition, identification and mapping of parallelizable portions of the code, intelligent scheduling onto different processing elements, remote or otherwise, and communication during such processing are still major hurdles towards achieving optimal speed-up of the actual execution of computer programs.
Use of multicore processors scores over use of multiple discrete processors in achieving high processing capacities without the attendant requirements of high clock cycles, heat generation and power input. Thus, multicore processors have become increasingly popular in the field of parallel computing.
In general, with the rising popularity of multicore processors, the need for related tools is also increasing. Application developers are in need of tools that could ease programming for multicore processors. On the other hand, there is also an increased need for tools and methods to parallelize existing computer programs so as to take optimal advantage of parallel hardware. The present invention is directed towards provision of a tool for automated adaptation of legacy code, existing sequential code and new code written in a sequential manner, for conversion into code for parallel execution and hence for achieving faster execution on parallel processors. The present inventors have come up with novel methods, and a system working thereupon, for reducing execution times of computer programs by enabling effective parallelization of the computer code thereof.
DESCRIPTION OF THE INVENTION
The present invention provides an automated method for parallelizing software source code. The source code is analyzed offline, either by executing the application at least once to determine the parallelizable code or by manually inspecting it. The source code is then modified to generate a new code by inserting Special Purpose Instructions (SPI) that run the code, or pieces thereof, ahead of time so as to achieve efficient usage of the available parallelization hardware and to reduce the total time required for execution of the application. It would be clear to a person of ordinary skill in the art that the present invention is applicable to any parallel computing system and computer code.
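By way of a non-limiting illustrative sketch, the effect of the inserted SPI may be pictured as follows. The sketch assumes a POSIX-threads environment and hypothetical module names; the actual form of the SPI emitted by the tool is not limited to thread primitives.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static int shared_input;

static void prepare(void)          { shared_input = 42; }   /* produces data for consume() */
static void independent_work(void) { usleep(1000); puts("independent module done"); }
static void consume(void)          { printf("consume saw %d\n", shared_input); }

/* Original sequential order: prepare(); independent_work(); consume();
 * The inserted SPI start the independent module ahead of its sequential
 * execution time and synchronize before the dependent module runs. */
static void *spi_worker(void *arg) { (void)arg; independent_work(); return NULL; }

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, spi_worker, NULL);  /* SPI: ahead-of-time launch */
    prepare();                                   /* runs concurrently          */
    pthread_join(&t, NULL);                      /* SPI: synchronize           */
    consume();                                   /* dependencies now satisfied */
    return 0;
}
```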
The method of parallelization of computer code in accordance with the principles of the present invention involves inputting sequential code into an automatic parallelization tool, whereupon it is processed by a syntactic analyzer and an offline profiler. The function of the syntactic analyzer is to determine the structure of the input code and thus identify positions at which additions may be made to the syntax without affecting its functionality. The functions of the offline profiler include data profiling, which further comprises mapping of call graphs and of the chronometry and chronology of the functions in the input code. The analyzer then determines the parallelizability of functions depending on the statements identified and causes the code generator to insert SPI into the source code to generate the output parallel source code, which is subsequently taken up for execution in a parallel computing environment. Data profiling in accordance with this embodiment of the present invention involves plotting a graph of variable vs. time for determining the time required for definition and updation of data variables in the subject code. To render this graph, the code is executed many times and the value of each variable and the associated time of updation are recorded. Another graph which is rendered is the variable vs. line number graph, which checks variable dependency based on the line number within the code. This data is stored in a memory device, such as a shared memory, for subsequent retrieval.
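A minimal sketch of the data profiling described above, assuming C source and a hypothetical TRACKED_SET instrumentation macro (the actual profiler may instrument assignments by other means), records for each updation the variable, its value, the time of updation and the source line, from which the variable vs. time and variable vs. line number graphs can be plotted:

```c
#include <stdio.h>
#include <time.h>

/* One row of the "variable vs. time" and "variable vs. line number" graphs. */
struct update_record {
    const char *variable;
    int         value;
    long        nanos;   /* time of updation relative to program start */
    int         line;    /* source line performing the update          */
};

static struct timespec t0;
static struct update_record log_buf[1024];
static int log_len;

static long elapsed_nanos(void) {
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - t0.tv_sec) * 1000000000L + (now.tv_nsec - t0.tv_nsec);
}

/* The offline profiler would arrange for every assignment to a tracked
 * variable to pass through a macro such as this hypothetical one. */
#define TRACKED_SET(var, val) \
    do { (var) = (val); \
         log_buf[log_len++] = (struct update_record){#var, (var), elapsed_nanos(), __LINE__}; \
    } while (0)

int main(void) {
    clock_gettime(CLOCK_MONOTONIC, &t0);
    int x = 0, y = 0;
    TRACKED_SET(x, 10);
    TRACKED_SET(y, x * 2);   /* dependency of y on x is visible in the log */
    TRACKED_SET(x, 10);      /* value unchanged: a candidate steady state  */
    for (int i = 0; i < log_len; i++)
        printf("%-4s = %-4d at %9ld ns, line %d\n",
               log_buf[i].variable, log_buf[i].value, log_buf[i].nanos, log_buf[i].line);
    return 0;
}
```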
The objective of preparing the call graph and inserting SPI is to initiate execution of portions of the code ahead of their actual times of initiation under the execution pattern of the original subject code. Another aspect of this exercise includes running redundant code where required (examples being if-else and switch-case statements). This results in execution of non-sequential program elements jumping ahead of their sequential execution times (SETs), based on analysis of data dependencies and on recorded statistics of updation of data variables. This further results in automatically deriving the dependency and timing information needed to identify and analyse the timing of modules for parallelization, yielding potential code for parallel execution.
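The redundant-code aspect may be sketched as follows for an if-else statement; the predicate and arm computations are hypothetical stand-ins, and POSIX threads stand in for the SPI. Both arms are computed ahead of the condition's resolution, and the result of the untaken arm is discarded:

```c
#include <pthread.h>
#include <stdio.h>

/* Both arms of an if-else are executed speculatively; once the slow
 * predicate resolves, one result is kept and the redundant one dropped. */
static int then_result, else_result;

static void *run_then(void *a) { (void)a; then_result = 1 + 1; return NULL; }  /* if-arm   */
static void *run_else(void *a) { (void)a; else_result = 2 * 3; return NULL; }  /* else-arm */

static int slow_predicate(void) { return 1; }  /* stands in for an expensive condition */

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, run_then, NULL);  /* SPI: speculate on both arms */
    pthread_create(&t2, NULL, run_else, NULL);
    int cond = slow_predicate();                /* evaluated concurrently       */
    pthread_join(&t1, NULL);
    pthread_join(&t2, NULL);
    int result = cond ? then_result : else_result;  /* redundant arm discarded */
    printf("result = %d\n", result);
    return 0;
}
```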
Call graphs, as determined during the data profiling and analysis of the input sequential code, illustrate graphically the fact that all functions follow a definite "call order", wherein the sequence is decided by the program logic contained within the source code and by the dependency, i.e., the requirement of data variables updated via execution of earlier functions. The call chart is thus drawn by analysing this dependency of data variables and the process flow of the program logic. The call graph illustrates graphically the sequence in which different functions are executed on the actual time-scale of execution of the subject sequential code. The objective of determining the call chart is to determine the minimum times required for initiating execution of the individual functions within the subject code.
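One hypothetical encoding of such a call graph, together with the minimum initiation times it yields, is sketched below; the dependency matrix and durations are illustrative assumptions only:

```c
#include <stdio.h>

#define NFUNC 4

/* Call graph for a hypothetical subject code: f0 -> f1, f0 -> f2, {f1,f2} -> f3.
 * dep[i][j] != 0 means function i requires variables updated by function j. */
static const int dep[NFUNC][NFUNC] = {
    /* f0 */ {0, 0, 0, 0},
    /* f1 */ {1, 0, 0, 0},
    /* f2 */ {1, 0, 0, 0},
    /* f3 */ {0, 1, 1, 0},
};
static const double duration[NFUNC] = {4.0, 2.0, 3.0, 1.0};

int main(void) {
    double start[NFUNC], finish[NFUNC];
    /* Functions are indexed in call order, so one forward pass suffices. */
    for (int i = 0; i < NFUNC; i++) {
        start[i] = 0.0;
        for (int j = 0; j < i; j++)
            if (dep[i][j] && finish[j] > start[i])
                start[i] = finish[j];   /* earliest initiation = latest needed input */
        finish[i] = start[i] + duration[i];
        printf("f%d: earliest start %.1f, finish %.1f\n", i, start[i], finish[i]);
    }
    return 0;
}
```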
To determine the absolute minimum wait time for execution of a program element, it is essential to critically assess the nature of the program elements as well as the data dependency statistics. This requires the subject source code to be executed a number of times, with data profiling done each time, to identify progressive decrements in the time required for initiation of execution of downstream program elements. This iterative methodology of chronometric assessment leads to the determination of a minimum time prior to which it is impossible to initiate execution of a downstream program element. This is the Absolute Minimum Wait Time (AMWT).
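A sketch of one conservative way of arriving at the AMWT from repeated profiled runs follows; it assumes that each run records the time at which the inputs of each downstream function were last updated, and takes the worst case over all runs as the floor below which initiation is impossible. The figures are illustrative assumptions:

```c
#include <stdio.h>

#define NRUNS 5
#define NFUNC 3

/* Hypothetical profiling output: for each run, the observed time (ms) at
 * which the inputs of each downstream function were last updated. */
static const double last_update_ms[NRUNS][NFUNC] = {
    {10.2, 25.9, 41.0},
    {10.8, 24.7, 40.2},
    {11.0, 26.3, 42.5},
    {10.5, 25.1, 40.9},
    {10.9, 25.8, 41.8},
};

int main(void) {
    /* AMWT for a function: the earliest time before which no observed run
     * would have permitted its execution, i.e. the worst case over runs. */
    for (int f = 0; f < NFUNC; f++) {
        double amwt = 0.0;
        for (int r = 0; r < NRUNS; r++)
            if (last_update_ms[r][f] > amwt)
                amwt = last_update_ms[r][f];
        printf("function %d: AMWT = %.1f ms\n", f, amwt);
    }
    return 0;
}
```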
Once the modified code is generated, it must be operationally adapted to non-sequential execution in a parallel computing environment, said environments ranging from multiple processors in proximate or remote configurations to multicore processors. Further embodiments of the present invention that enable this objective shall be outlined in the subsequent description.
The analysis phase mentioned above also includes a module vs. time graph, which gives a representation of the time taken by each individual module. This analysis is critical for finding the target modules for parallelization: modules which take most of the time are the best suited for parallelization. Determining the critical coordinates of time and line number within the source code, whereafter the value of a particular data variable does not change, is instrumental in deciding the 'ahead of time' execution of the downstream functions. According to principles of the present invention, the steady-state values of data variables at their respective critical points are stored in memory and recalled therefrom upon requirement by downstream functions. This enables evolution of logic for discrete execution of functions coded for sequential execution.
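The storage and recall of steady-state values may be sketched as follows; the publish/recall helpers and the in-process store are hypothetical names standing in for the shared-memory mechanism described above:

```c
#include <stdio.h>

/* Hypothetical steady-state store: once profiling shows that a variable no
 * longer changes past its critical (time, line) coordinate, its value is
 * saved, and downstream functions read the saved copy instead of waiting. */
struct steady_entry {
    const char *variable;
    int         critical_line;  /* line after which the value never changes */
    int         value;
};

static struct steady_entry store[16];
static int store_len;

static void publish(const char *var, int line, int value) {
    store[store_len++] = (struct steady_entry){var, line, value};
}

static int recall(const char *var) {
    for (int i = 0; i < store_len; i++)
        if (store[i].variable == var)  /* literal identity suffices for a sketch */
            return store[i].value;
    return 0;
}

int main(void) {
    int rate = 7;               /* reaches steady state early in execution */
    publish("rate", 12, rate);  /* critical point found by data profiling  */
    /* A downstream function, started ahead of time, recalls the value
     * without waiting for the upstream code region to finish. */
    printf("downstream sees rate = %d\n", recall("rate"));
    return 0;
}
```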
Another aspect of the present invention is the iterative process to determine the true dead times, or true earliest times, whereupon the corresponding functions may be taken up for execution. The present inventors have come up with an iterative way of determining the true dead times of individual functions.
According to another aspect of the present invention, wherein the parallel computing environment comprises multiple processors, the method of the present invention involves determining the optimal number of processors required for optimal parallelization of the code. Such an exercise involves scheduling of each task, or part thereof, on different processors, thus indicating savings in pecuniary and chronometric terms.
It is a further object of the present invention to determine the actual number of processors required for achieving optimal speedup of the execution of software code. Based on the dead time and wait time, the tool can also indicate the optimal number of processing elements needed for the execution of any given sequential code. The optimal number is calculated by requiring that the waiting time and dead time be the minimum possible while the throughput of all the processing elements is maximal.
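A sketch of one way the optimal processor count might be derived from the profiled wait (AMWT) and duration data follows; the greedy scheduling policy and the figures are illustrative assumptions, the criterion being the smallest count beyond which the overall execution time stops improving:

```c
#include <stdio.h>

#define NTASK 6

/* Hypothetical profile: earliest permissible start (AMWT) and duration, ms. */
static const double amwt[NTASK]     = {0, 0, 5, 5, 12, 12};
static const double duration[NTASK] = {5, 7, 7, 4,  6,  6};

/* Makespan of a greedy schedule on n processors: each task, in order,
 * goes to the processor that frees up soonest, honouring its AMWT. */
static double makespan(int n) {
    double free_at[16] = {0};
    double end = 0;
    for (int t = 0; t < NTASK; t++) {
        int best = 0;
        for (int p = 1; p < n; p++)
            if (free_at[p] < free_at[best]) best = p;
        double start = free_at[best] > amwt[t] ? free_at[best] : amwt[t];
        free_at[best] = start + duration[t];       /* wait time honoured here */
        if (free_at[best] > end) end = free_at[best];
    }
    return end;
}

int main(void) {
    /* The optimal count is the smallest n past which the makespan stops
     * improving, i.e. added processors would only accumulate dead time. */
    double prev = makespan(1);
    printf("1 processor: %.1f ms\n", prev);
    for (int n = 2; n <= 8; n++) {
        double m = makespan(n);
        printf("%d processors: %.1f ms\n", n, m);
        if (m >= prev) { printf("optimal: %d processors\n", n - 1); break; }
        prev = m;
    }
    return 0;
}
```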
In furtherance of the elaborations outlined in the detailed description, reference is now made to some examples, the purpose of which is only to explain the practical application of the present invention. Those skilled in the art shall appreciate that the examples illustrated are in no way exhaustive and do not limit the scope of the present invention.
Although the present invention has been described in connection with one embodiment of the present invention as evident from the accompanying description, it is not limited thereto. It will be apparent to those skilled in the art that various substitutions, modifications and changes may be possible thereto without departing from the scope and spirit of the present invention.
Submitted on this 25th day of November 2008