Methods And Systems For Automated Design Of Materials And Its Manufacturing Process For Desired Properties

Abstract: The disclosure relates generally to methods and systems for automated design of materials and the manufacturing process for desired properties. Conventional automated materials design techniques do not perform an integrated design of (i) a material composition and (ii) its manufacturing processing steps. The present disclosure addresses this gap by using a multi-agent setup for automated design, wherein a distinct reinforcement learning (RL) agent is used to mirror the composition selection (CS) and each of the sequential manufacturing process steps (PS) involved in the manufacturing route. The distinct RL agents learn from both past design data and computational models (empirical/analytical/physics-based models) representing the design process. The present disclosure also integrates other important parameters such as manufacturability, ESG norms, cost, and process energy, together with their relative importance, into the design decision making of the RL agents by expressing them as reward components upon which the RL agents are trained. [To be published with FIG. 2]

Patent Information

Application #

Filing Date

07 March 2024

Publication Number

37/2025

Publication Type

INA

Invention Field

ELECTRONICS

Status

Parent Application

Applicants

Tata Consultancy Services Limited

Nirmal Building, 9th floor, Nariman point, Mumbai 400021, Maharashtra, India

Inventors

1. MUHAMMED, Bilal

Tata Consultancy Services Limited, Peepul Park, Technopark Campus, Kariyavattom P.O. Trivandrum 695581, Kerala, India

2. BHATTACHARJEE, Akash

Tata Consultancy Services Limited, Quadra II (Second Floor), Survey No 238/239 (Opposite Magarpatta City), Hadapsar, Pune 411028, Maharashtra, India

3. BASAVARSU, Purushottham Gautham

Tata Consultancy Services Limited, Plot No. 2 & 3, MIDC-SEZ, Rajiv Gandhi Infotech Park, Hinjewadi Phase III, Pune 411057, Maharashtra, India

4. TENNYSON, Gerald

Tata Consultancy Services Limited, Plot No. 2 & 3, MIDC-SEZ, Rajiv Gandhi Infotech Park, Hinjewadi Phase III, Pune 411057, Maharashtra, India

5. JOSHI, Amol Dilip

Tata Consultancy Services Limited, Plot No. 2 & 3, MIDC-SEZ, Rajiv Gandhi Infotech Park, Hinjewadi Phase III, Pune 411057, Maharashtra, India

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
METHODS AND SYSTEMS FOR AUTOMATED DESIGN OF MATERIALS AND ITS MANUFACTURING PROCESS FOR DESIRED PROPERTIES
Applicant
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description:
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
[001]
The disclosure herein generally relates to the field of material design, and more specifically to methods and systems for automated design of materials and its manufacturing process for desired properties.
BACKGROUND
[002]
Material design and development is a complex, costly, and time-consuming process with a high resource requirement. Materials are designed for a specific application which requires a set of target properties. The final properties of the materials are determined by the composition and the structure obtained via the processing of the material through a manufacturing route. There are multiple manufacturing process routes available in the art to develop the material. Each manufacturing process route often involves multiple sequential manufacturing process steps. For example, a hot rolled steel sheet is produced through two different manufacturing process routes, viz., a conventional casting route or a thin slab casting route, depending on the final property requirements. Along with the target properties of the material, other parameters such as cost, manufacturability, constraints arising from environment, social, and governance (ESG) norms, production capabilities, and so on are taken into consideration during production to arrive at an optimal design.
[003]
Conventional techniques for materials design involve extensive experimental trials and rely on the experience of the designers and on known processing-structure-property relations. Most of the conventional automated materials design techniques either focus on correlating the composition of the materials with their properties, or use a process model to correlate the process parameters with the properties of the material at a single process level. They do not take into consideration the entire process chain for manufacturing the materials, which determines the underlying structure evolution and thereby the properties of the developed material. The conventional automated materials design techniques lack an integrated design of (i) a material composition and (ii) its manufacturing processing steps.
SUMMARY
[004]
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
[005]
In an aspect, a processor-implemented method for automated design of materials and its manufacturing process for desired properties is provided. The method includes the steps of: receiving a historical design data associated with a plurality of materials, from a repository; creating a training dataset of each material of the plurality of materials, from the historical design data, to obtain a plurality of training data sets associated with the plurality of materials, wherein the training data set of each of the plurality of materials comprises (i) an achieved value of each of one or more properties of the material, (ii) a value of each of one or more composition elements present in the material, (iii) a value of each of one or more process parameters of each of one or more sequential manufacturing process steps present in each of one or more manufacturing process routes that produces the material; receiving a target value of each of the one or more properties of each material of the plurality of materials; training a set of reinforcement learning (RL) models, using the plurality of training data sets and the target value of each of the one or more properties of each material of the plurality of materials to obtain a set of trained RL models, wherein each RL model in the set of RL models is defined for (i) a composition selection from the one or more composition elements, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes; receiving the target value of each of the one or more properties of a desired material; and passing the target value of each of the one or more properties of the desired material, to the set of trained RL models, to sequentially predict (i) one or more material compositions, and (ii) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, for each of the one or more material compositions, and wherein each of the one or more material compositions comprises the value of each of the one or more composition elements.
[006]
In another aspect, a system for automated design of materials and its manufacturing process for desired properties is provided. The system includes: a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to: receive a historical design data associated with a plurality of materials, from a repository; create a training dataset of each material of the plurality of materials, from the historical design data, to obtain a plurality of training data sets associated with the plurality of materials, wherein the training data set of each of the plurality of materials comprises (i) an achieved value of each of one or more properties of the material, (ii) a value of each of one or more composition elements present in the material, (iii) a value of each of one or more process parameters of each of one or more sequential manufacturing process steps present in each of one or more manufacturing process routes that produces the material; receive a target value of each of the one or more properties of each material of the plurality of materials; train a set of reinforcement learning (RL) models, using the plurality of training data sets and the target value of each of the one or more properties of each material of the plurality of materials to obtain a set of trained RL models, wherein each RL model in the set of RL models is defined for (i) a composition selection from the one or more composition elements, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes; receive the target value of each of the one or more properties of a desired material; and pass the target value of each of the one or more properties of the desired material, to the set of trained RL models, to sequentially predict (i) one or more material compositions, and (ii) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, for each of the one or more material compositions, and wherein each of the one or more material compositions comprises the value of each of the one or more composition elements.
[007]
In yet another aspect, there is provided a computer program product comprising a non-transitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to: receive a historical design data associated with a plurality of materials, from a repository; create a training dataset of each material of the plurality of materials, from the historical design data, to obtain a plurality of training data sets associated with the plurality of materials, wherein the training data set of each material of the plurality of materials comprises (i) an achieved value of each of one or more properties of the material, (ii) a value of each of one or more composition elements present in the material, (iii) a value of each of one or more process parameters of each of one or more sequential manufacturing process steps present in each of one or more manufacturing process routes that produces the material; receive a target value of each of the one or more properties of each material of the plurality of materials; train a set of reinforcement learning (RL) models, using the plurality of training data sets and the target value of each of the one or more properties of each material of the plurality of materials to obtain a set of trained RL models, wherein each RL model in the set of RL models is defined for (i) a composition selection from the one or more composition elements, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes; receive the target value of each of the one or more properties of a desired material; and pass the target value of each of the one or more properties of the desired material, to the set of trained RL models, to sequentially predict (i) one or more material compositions, and (ii) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, for each of the one or more material compositions, and wherein each of the one or more material compositions comprises the value of each of the one or more composition elements.
[008]
In an embodiment, training the set of RL models, using the plurality of training data sets to obtain the set of trained RL models, comprises: defining a global reward function of the set of RL models, as a weighted function of the one or more properties of the material, and one or more characteristics of the material; defining a local reward function of each RL model in the set of RL models, as a weighted function of one or more of: (i) one or more sequential manufacturing process step specific constraints, (ii) one or more evaluation metrics, (iii) one or more material structure state constraints, and (iv) the one or more characteristics of the material that are dependent on each sequential manufacturing process step, and wherein the one or more evaluation metrics comprises one or more environment, social, and governance (ESG) norms, one or more economic indices, and one or more manufacturability indices; defining a training environment of each RL model in the set of RL models, using the local reward function of the associated RL model, and by utilizing a set of computational simulation models, wherein the training environment of each RL model determines a next state, the value of the local reward function, and a learning episode completion status, for a given state, based on an action taken by the corresponding RL model; transforming each training data set of the plurality of training data sets, one at a time, along with the target value of each of the one or more properties of each material of the plurality of materials, to determine (i) a state, (ii) the next state, (iii) the action, (iv) the reward, and (v) the learning episode completion status, of each RL model in the set of RL models; and iteratively training each RL model in the set of RL models, defined for (i) the composition selection, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, in a reverse sequential order, by utilizing the training environment of the associated RL model and each training dataset present in the plurality of training data sets, to obtain the set of trained RL models.
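The ordering and the per-agent training sample described in this embodiment can be sketched as follows. This is only an illustrative reading, not the disclosed implementation: it assumes that "reverse sequential order" means the agent for the last process step is trained first and the composition selection (CS) agent last, and the names `Transition`, `design_order`, `training_order`, and the step labels `PS1` to `PS3` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    """One training sample for a single agent (field names are hypothetical)."""
    state: Tuple[float, ...]
    action: Tuple[float, ...]
    next_state: Tuple[float, ...]
    reward: float
    done: bool  # learning episode completion status


def design_order(process_steps: List[str]) -> List[str]:
    """Order in which agents act at design time: CS first, then each process step."""
    return ["CS"] + list(process_steps)


def training_order(process_steps: List[str]) -> List[str]:
    """Assumed reverse sequential order: the last process step's agent is trained first."""
    return list(reversed(design_order(process_steps)))


route = ["PS1", "PS2", "PS3"]  # placeholder step names, not from the source
print(design_order(route))
print(training_order(route))
```

Under this reading, each agent is trained only after every agent downstream of it in the route has been trained.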
[009]
In an embodiment, the state of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to one or more of (i) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps that are precedent to the corresponding sequential manufacturing process step present in each manufacturing process route, (ii) the value of each of the one or more composition elements, (iii) the target value of each of one or more properties of the material, (iv) the one or more evaluation metrics, and (v) one or more material structure state parameters.
[010]
In an embodiment, the action of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to the value of each of one or more composition elements present in the material, and the value of each of the one or more process parameters of the corresponding sequential manufacturing process step present in each manufacturing process route, respectively.
[011]
In an embodiment, the reward of each RL model is defined as a sum of a value of the local reward function of the associated RL model and the value of the global reward function.
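As a minimal sketch of this reward definition, the snippet below adds a local term and a global term for one agent. The specification only states that each term is a "weighted function"; a plain weighted sum is assumed here, and the component names and all numeric values are made-up placeholders, not data from the disclosure.

```python
from typing import Dict


def weighted_sum(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed form of a 'weighted function': sum of weight * term over named components."""
    return sum(weights[name] * terms[name] for name in weights)


def agent_reward(
    local_terms: Dict[str, float],
    local_weights: Dict[str, float],
    global_terms: Dict[str, float],
    global_weights: Dict[str, float],
) -> float:
    """Reward of one RL agent = local reward value + global reward value."""
    local_value = weighted_sum(local_terms, local_weights)
    global_value = weighted_sum(global_terms, global_weights)
    return local_value + global_value


# Placeholder components: penalties are negative, property match is positive.
local_terms = {"process_energy": -0.2, "cost": -0.5}
local_weights = {"process_energy": 1.0, "cost": 2.0}
global_terms = {"property_match": 0.9}
global_weights = {"property_match": 10.0}

print(agent_reward(local_terms, local_weights, global_terms, global_weights))
```

The weights play the role of the relative importance that the designer assigns to each consideration; changing them shifts what the agents optimize without changing the agents themselves.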
[012]
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[013]
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[014]
FIG. 1 is an exemplary block diagram of a system for automated design of materials and its manufacturing process for desired properties, in accordance with some embodiments of the present disclosure.
[015]
FIG. 2 is an exemplary block diagram illustrating a plurality of modules of the system of FIG. 1, for automated design of materials and its manufacturing process for desired properties, in accordance with some embodiments of the present disclosure.
[016]
FIGS. 3A-3B illustrate exemplary flow diagrams of a processor-implemented method for automated design of materials and its manufacturing process for desired properties, using the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[017]
FIG. 4 is a flowchart comprising steps for creating a plurality of training datasets of a plurality of materials, from the historical design data, in accordance with some embodiments of the present disclosure.
[018]
FIG. 5A shows an exemplary design framework for automated design of materials and manufacturing process designs for targeted requirements, in accordance with some embodiments of the present disclosure.
[019]
FIG. 5B shows exemplary conventional manufacturing process routes for producing a hot rolled steel sheet.
[020]
FIG. 6 shows an exemplary setup of the RL problem for a material design decision making problem, in accordance with some embodiments of the present disclosure.
[021]
FIG. 7 is a flowchart comprising steps for training the set of RL models, using the plurality of training data sets to obtain the set of trained RL models, in accordance with some embodiments of the present disclosure.
[022]
FIG. 8 is a flowchart comprising steps for [cross-reference not found in the source], in accordance with some embodiments of the present disclosure.
[023]
FIG. 9 is a flowchart comprising steps for defining a training environment for each RL model using the plurality of training datasets, in accordance with some embodiments of the present disclosure.
[024]
FIG. 10 shows an exemplary training strategy of the set of RL models, in accordance with some embodiments of the present disclosure.
[025]
FIG. 11 is a flowchart comprising steps for generative design approach to generate multiple material and its manufacturing process designs for a given requirement, in accordance with some embodiments of the present disclosure.
[026]
FIG. 12 shows RL model architecture adopted for hot rolled steel sheet design, in accordance with some embodiments of the present disclosure.
[027]
FIG. 13 shows design solutions generated by the RL model architecture adopted for hot rolled steel sheet design of FIG. 12, for a given requirement, in accordance with some embodiments of the present disclosure.
[028]
FIG. 14 shows comparison results of the design solutions generated by the RL model architecture adopted for hot rolled steel sheet design of FIG. 12, and the domain experts, in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[029]
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
[030]
Materials are designed for a specific application which requires a set of target properties. While composition selection is a starting point of materials design, it is not independent of the manufacturing route, as the changes imparted by the manufacturing route on the structure of the material determine its final properties. The composition of the material plays a key role during the manufacturing process as the material evolves into the final component which imparts the properties desired. Hence materials are often co-designed along with the manufacturing process. There can be multiple manufacturing process routes feasible to develop a material. Each manufacturing process route often involves multiple sequential manufacturing process steps (alternatively referred to hereafter as processing steps or process steps).
[031]
For example, a hot rolled steel sheet can be produced through two different manufacturing routes, viz., a conventional casting route or a thin slab casting route, depending on the final property requirements. The design process to develop a hot rolled steel sheet through the conventional casting route may involve composition selection followed by sequential manufacturing process steps such as casting, reheating, rough rolling, finish rolling, and ROT & coiling, whereas for the thin slab casting route, the rough rolling step is not present. Hence, to design the final target material, it becomes imperative to consider the composition of the material and the steps in its manufacturing process together with objectives on its target properties, the cost, the manufacturability, etc., in addition to various constraints arising from environment, social, and governance (ESG) norms, production capabilities, and so on. This generally involves extensive experimentation, characterization, and testing to arrive at the desired materials and process design satisfying the required target conditions.
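The two hot rolled steel sheet routes named above can be written down as ordered step lists, which makes the difference between them explicit. This is a sketch for illustration only; the dictionary `ROUTES`, the identifier spellings, and the helper names are assumptions, and the thin slab route is modelled exactly as the text describes it (the conventional route without rough rolling).

```python
from typing import Dict, List

# Step lists follow the example in the text; identifier spellings are hypothetical.
ROUTES: Dict[str, List[str]] = {
    "conventional_casting": [
        "casting", "reheating", "rough_rolling", "finish_rolling", "rot_coiling",
    ],
    "thin_slab_casting": [
        "casting", "reheating", "finish_rolling", "rot_coiling",
    ],
}


def design_sequence(route: str) -> List[str]:
    """Composition selection comes first, followed by the route's process steps in order."""
    return ["composition_selection"] + ROUTES[route]


def steps_only_in(route_a: str, route_b: str) -> List[str]:
    """Process steps present in route_a but absent from route_b, in route_a's order."""
    return [step for step in ROUTES[route_a] if step not in ROUTES[route_b]]


for name in ROUTES:
    print(name, design_sequence(name))
print(steps_only_in("conventional_casting", "thin_slab_casting"))
```

A representation of this kind lets one design decision maker be attached to each entry of `design_sequence`, so that a route with fewer steps simply uses fewer of them.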
[032]
Below are the major challenges and complexities associated with automating the materials and manufacturing process design:
• The properties achieved by the material depend on both the composition and the structure. The structure depends on processing steps and processing parameters. Therefore, both composition and manufacturing processes need to be designed to achieve the required properties.
• The materials and their manufacturing process design is a vast design space problem due to the variable space and combinations.
• Multiple processing routes can be followed to develop a material for a given requirement.
• Manufacturability, ESG norms, and cost aspects need to be integrated into the design process.
• The composition-process-structure-property relations that exist in materials design problems are complex and not well understood.
• High quality and comprehensive materials data are of limited availability and are most often proprietary in nature.
• Materials design and development takes considerable time owing to the large number of experimental trials required and/or the computationally expensive simulation models (empirical/analytical/physics-based) in use.
[033]
Most of the conventional automated materials design techniques have evolved from experience/experiment based methods, to traditional empirical methods, to more theoretical physics-based models and machine learning techniques. The traditional empirical/semi-empirical model-based design involved extensive experimental trials and utilized the designer’s experience and known composition-process-structure-property relations. This technique resulted in a linear exploration of the design space through design of experiments. Automated characterization and property measurement tools were introduced to accelerate experimentation.
[034]
Recent frameworks such as ICME and materials genome frameworks combine theoretical physics-based models with computational tools and material informatics. Computational design helps to optimize material selection for specific product designs. Material informatics leverages material databases to build AI/ML models that learn relations. Physics-based models have made significant advancements in recent times. These models incorporate multi-scale modeling of materials from atomistic, nano, and micro scales to macro scales. Materials genome engineering combines high throughput computations (HTC), high throughput experimentation, and big data. HTC is used for generating a pool of material designs and for virtual screening of material designs to reduce experimental trials. Data mining and AI or ML techniques are used to improve HTC's efficiency further. However, HTC incurs high computational cost and resource requirements.
[035]
Designers use a combination of experience, computational models, and extensive experimentations to arrive at the materials and process design for a given requirement. The various technology gaps that exist in the industry are as follows:
• The existing automated materials design systems do not perform an integrated design of material compositions and their manufacturing processing steps. Most of the present methods focus on correlating between composition of the materials and their properties, while every processing step involved may also affect the underlying structure and thereby the properties of the developed material.
• The existing automated materials design systems do not consider multiple feasible manufacturing processing routes.
• The existing automated materials design systems do not integrate manufacturability, ESG norms, and cost into the decision-making process.
• The data scarcity associated with specific materials design problems limits the effectiveness of current design systems.
[036]
The present disclosure solves the technical problems in the art with the methods and systems for automated design of materials and its manufacturing process for desired properties. The present disclosure addresses this gap by using a multi-agent setup for automated design, wherein a distinct reinforcement learning (RL) agent is used to mirror the composition selection (CS) and each of the sequential manufacturing process steps (PS) involved in the manufacturing route. The distinct RL agents learn from both past design data and computational models (empirical/analytical/physics-based models) representing the design process. The present disclosure also integrates other important parameters such as manufacturability, ESG norms, cost, and process energy, together with their relative importance, into the design decision making of the RL agents by expressing them as reward components upon which the RL agents are trained.
[037]
Referring now to the drawings, and more particularly to FIG. 1 through FIG. 14, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments, and these embodiments are described in the context of the following exemplary systems and/or methods.
[038]
FIG. 1 is an exemplary block diagram of a system 100 for automated design of materials and its manufacturing process for desired properties, in accordance with some embodiments of the present disclosure. In an embodiment, the system 100 includes or is otherwise in communication with one or more hardware processors 104, communication interface device(s) or input/output (I/O) interface(s) 106, and one or more data storage devices or memory 102 operatively coupled to the one or more hardware processors 104. The one or more hardware processors 104, the memory 102, and the I/O interface(s) 106 may be coupled to a system bus 108 or a similar mechanism.
[039]
The I/O interface(s) 106 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface (GUI), and the like, as well as interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a plurality of sensor devices, a printer and the like. Further, the I/O interface(s) 106 may enable the system 100 to communicate with other devices, such as web servers and external databases.
[040]
The I/O interface(s) 106 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. For the purpose, the I/O interface(s) 106 may include one or more ports for connecting a number of computing systems with one another or to another server computer. Further, the I/O interface(s) 106 may include one or more ports for connecting a number of devices to one another or to another server.
[041]
The one or more hardware processors 104 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In the context of the present disclosure, the expressions ‘processors’ and ‘hardware processors’ may be used interchangeably. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, portable computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
[042]
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the memory 102 includes a plurality of modules 102a and a repository 102b for storing data processed, received, and generated by one or more of the plurality of modules 102a. The plurality of modules 102a may include routines, programs, objects, components, data structures, and so on, which perform particular tasks or implement particular abstract data types.
[043]
The plurality of modules 102a may include programs or computer-readable instructions or coded instructions that supplement applications or functions performed by the system 100. The plurality of modules 102a may also be used as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the plurality of modules 102a can be used by hardware, by computer-readable instructions executed by the one or more hardware processors 104, or by a combination thereof. In an embodiment, the plurality of modules 102a can include various sub-modules (not shown in FIG. 1). Further, the memory 102 may include information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure.
[044]
The repository 102b may include a database or a data engine. Further, the repository 102b, amongst other things, may serve as a database or include a plurality of databases for storing the data that is processed, received, or generated as a result of the execution of the plurality of modules 102a. Although the repository 102b is shown internal to the system 100, it will be noted that, in alternate embodiments, the repository 102b can also be implemented external to the system 100, where the repository 102b may be stored within an external database (not shown in FIG. 1) communicatively coupled to the system 100. The data contained within such external databases may be periodically updated. For example, data may be added into the external database and/or existing data may be modified and/or non-useful data may be deleted from the external database. In one example, the data may be stored in an external system, such as a Lightweight Directory Access Protocol (LDAP) directory and a Relational Database Management System (RDBMS). In another embodiment, the data stored in the repository 102b may be distributed between the system 100 and the external database.
[045]
Referring collectively to FIG. 2 and FIGS. 3A-3B, components and functionalities of the system 100 are described in accordance with an example embodiment of the present disclosure. For example, FIG. 2 is an exemplary block diagram illustrating the plurality of modules 102a of the system 100 of FIG. 1, for automated design of materials and its manufacturing process for desired properties, in accordance with some embodiments of the present disclosure. In an embodiment, the plurality of modules 102a include a data pre-processing module 202, a training module 204, and a prediction module 206.
[046]
For example, FIGS. 3A-3B illustrate exemplary flow diagrams of a processor-implemented method 300 for automated design of materials and its manufacturing process for desired properties, using the system 100 of FIG. 1, in accordance with some embodiments of the present disclosure. Although steps of the method 300 including process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods, and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any practical order. Further, some steps may be performed simultaneously, or some steps may be performed alone or independently.
[047]
At step 302 of the method 300, the one or more hardware processors 104 of the system 100 are configured to receive historical design data associated with a plurality of materials, from the repository 102b. In an embodiment, the historical design data includes past design data and/or the design data that is generated from design of experiments (DOE) analysis. The historical design data is stored in the repository 102b.
[048]
More specifically, the historical design data of each material includes a composition, number and proportions of the composition elements present in the composition, the manufacturing route employed, the number of manufacturing process steps present in the manufacturing route employed, process parameters of each manufacturing process step, achieved properties of the material, other parameters considered during the manufacturing process, and so on.
[049]
At step 304 of the method 300, the one or more hardware processors 104 of the system 100 are configured to create a training dataset of each material of the plurality of materials, from the historical design data received at step 302 of the method 300. A plurality of training data sets corresponding to the plurality of materials is obtained as output of this step. In this step, the historical design data received at step 302 of the method 300 is pre-processed to obtain the plurality of training data sets. In an embodiment, the data pre-processing module 202 is configured to pre-process and create the plurality of training data sets corresponding to the plurality of materials from the historical design data.
[050]
FIG. 4 is a flowchart comprising steps for creating a plurality of training datasets of a plurality of materials, from the historical design data, in accordance with some embodiments of the present disclosure. As shown in FIG. 4, pre-processing the historical design data includes but is not limited to data segregation, filling missing data, data normalization, and outlier removal. After the data pre-processing, the training data contains the plurality of training data sets corresponding to the plurality of materials.
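The pre-processing operations named above can be illustrated with a minimal Python sketch. The disclosure does not state which imputation, outlier, or normalization rules are used, so mean imputation, a 3-sigma outlier rule, and min-max scaling are assumptions here, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def fill_missing(values):
    """Replace None entries with the mean of the observed values (assumed rule)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def remove_outliers(values, z_max=3.0):
    """Keep values within z_max population standard deviations of the mean (assumed rule)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= z_max * sigma]


def min_max_normalize(values):
    """Scale values linearly to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

In practice each function would be applied column by column (for example, to one process parameter across all historical records) after the data has been segregated by material.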
[051]
The training data set of each material includes (i) an achieved value of each of one or more properties of a material, (ii) a value of each of one or more composition elements present in the material, and (iii) a value of each of one or more process parameters of each of one or more sequential manufacturing process steps present in each of one or more manufacturing process routes that produce the material.
[052]
In an embodiment, the one or more properties of the material indicate material properties including, but not limited to, a set of mechanical properties such as a yield strength (YS), an ultimate tensile strength (UTS), and a percentage of uniform elongation, and other specific properties such as a fracture toughness. The achieved value of each property of the material includes the property value of each property that the material exhibits after its production.
[053]
In an embodiment, the one or more composition elements present in the material indicate the elements that are part of the composition of the given material. For example, carbon (C), silicon (Si), iron (Fe), chromium (Cr), and nickel (Ni) are some of the elements (referred to as composition elements) in a material composition. The value of each of one or more composition elements refers to the composition value, a proportion value, or a percentage value of each such composition element.
[054]
In an embodiment, the one or more manufacturing process routes indicate the manufacturing process routes that are followed/executed for making such material. There may be a number of manufacturing process routes for making a single material. Further, the one or more sequential manufacturing process steps present in each manufacturing process route indicate the manufacturing process steps that are present in a given single route. Naturally, the manufacturing process steps are executed in a sequential manner (one process after the other in a sequential order), hence, they are called the sequential manufacturing process steps.
[055]
FIG. 5A shows an exemplary design framework for automated design of materials and manufacturing process designs for targeted requirements, in accordance with some embodiments of the present disclosure. The material and its manufacturing process design consists of selecting the composition of the material (composition selection) followed by design of various subsequent processing steps to achieve the specified target properties and desired characteristics. The materials can be developed through multiple manufacturing process routes, which differ in the processing steps involved. The design framework augments the current design process by providing multiple design alternatives to the designer for a given requirement.
[056]
As shown in FIG. 5A, the composition selection is a prior step before the manufacturing process route to design any material. There are k number of manufacturing process routes (route 1, route 2, …, route k) defined, where each route contains some manufacturing process steps. For example, in route 1 of FIG. 5A, there are n number of manufacturing process steps (process step 1_1, process step 1_2, …, process step 1_n).
[057]
FIG. 5B shows exemplary conventional manufacturing process routes for producing a hot rolled steel sheet. As shown in FIG. 5B, there are two manufacturing process routes (route 1 and route 2) possible to produce the material exhibiting the desired properties. In manufacturing process route 1, the one or more sequential manufacturing process steps are conventional casting, reheating, rough rolling, finish rolling, and ROT & coiling. Because these manufacturing process steps are sequential, the first manufacturing process step (after the composition selection) in route 1 is the conventional casting, next the reheating, and so on, until the last manufacturing process step, which is the ROT & coiling. Similarly, the one or more sequential manufacturing process steps in the manufacturing process route 2 are thin slab casting, reheating, finish rolling, and ROT & coiling.
[058]
The one or more process parameters of each sequential manufacturing process step indicate the process parameters defined for each sequential manufacturing process step based on the nature of the corresponding manufacturing process. The number of parameters and the variety of parameters may not be the same for each sequential manufacturing process step. For example, the process parameters of the reheating process step are a holding temperature, a heating rate, and a holding time. Similarly, the process parameters for the thin slab casting process step are a casting speed, a melt superheat, a nozzle diameter, and a cooling rate. The value of each process parameter corresponding to the sequential manufacturing process step indicates the parameter value in the given measurement unit.
[059]
At step 306 of the method 300, the one or more hardware processors 104 of the system 100 are configured to receive a target value of each of the one or more properties of each material of the plurality of materials. The target value of each property of the material refers to the target property value of each property that the given material should have. In an embodiment, the target value of each property of the material is provided randomly based on a strategy defined by a material designer or a user.
[060]
At step 308 of the method 300, the one or more hardware processors 104 of the system 100 are configured to train a set of reinforcement learning (RL) models, to obtain a set of trained RL models. The set of RL models is trained using the plurality of training data sets obtained at step 304 of the method 300 and the target value of each of the one or more properties of each material received at step 306 of the method 300. In an embodiment, the training module 204 is configured to train the set of reinforcement learning (RL) models with the plurality of training data sets to obtain the set of trained RL models.
[061]
Each RL model is independent, and the specific RL algorithm and architecture of neural networks used (number of hidden layers, activation function, etc.) can be chosen appropriately for the given manufacturing process step/composition selection problem. Each RL model in the set of RL models is defined for (i) a composition selection that contains information about one or more composition elements present in the corresponding material, and (ii) each of the one or more sequential manufacturing process steps present in each manufacturing process route. For example, to follow the manufacturing process route 1 of FIG. 5B, a set of 6 RL models is defined, where one RL model is defined for the composition selection, and one each of the remaining 5 RL models is defined for the conventional casting, the reheating, the rough rolling, the finish rolling, and the ROT & coiling. Similarly, to follow the manufacturing process route 2 of FIG. 5B, a set of 5 RL models is defined, where one RL model is defined for the composition selection, and one each of the remaining 4 RL models is defined for the thin slab casting, the reheating, the finish rolling, and the ROT & coiling.
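The one-model-per-decision arrangement for the two routes of FIG. 5B can be sketched as a simple lookup. The dictionary layout and the function name are illustrative assumptions; only the step names and the resulting counts (6 and 5) come from the description.

```python
# Sequential manufacturing process steps of the two routes of FIG. 5B.
ROUTES = {
    "route_1": ["conventional casting", "reheating", "rough rolling",
                "finish rolling", "ROT & coiling"],
    "route_2": ["thin slab casting", "reheating", "finish rolling",
                "ROT & coiling"],
}


def decisions_for_route(route):
    """Return the ordered decision problems, one RL model each, for a route."""
    return ["composition selection"] + ROUTES[route]


for route in ROUTES:
    print(route, len(decisions_for_route(route)), "RL models")
```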
[062]
Thus, the framework of the present disclosure employs the RL models (interchangeably referred to as RL agents) for taking the various decisions in the design problem. The model architecture of the present disclosure consists of a modular multi-agent setup for automated design, wherein a distinct RL model is assigned to make decisions on the composition selection (CS) and various sequential manufacturing process steps (PS) involved in its manufacturing process route. Each RL model is a distinct RL agent and therefore the terms ‘RL model’ and ‘RL agent’ are used interchangeably in this description. The RL agents work in a sequential fashion to generate the material composition and manufacturing process parameters for each of the manufacturing process steps present in the given manufacturing process route for a given target requirement. The RL agents have an inherent hierarchy as per the position of the processing step in the design process.
[063]
FIG. 6 shows an exemplary setup of the RL problem for a materials design decision making problem, in accordance with some embodiments of the present disclosure. Referring to FIG. 6, the composition selection (CS) / manufacturing processing step (PS) problem can be configured as an RL problem as follows. An RL problem consists of an agent learning a policy by working with an environment. For a given state condition, the agent performs actions on the environment and the environment provides the next state, reward, and done information accordingly. In the case of a design decision problem such as the composition selection (CS) or processing step (PS) in a material and its manufacturing process design, the environment is composed of one or more simulation model(s) for the specified process. The simulation model can be an analytical model, empirical model, machine learning (ML) model, or any other type of physics-based model. For example, a casting process environment can be composed of an empirical casting model which can evaluate the microstructure of the cast after the casting process if the composition of the material and the casting process variables are given. The state of a CS/PS environment consists of the following in the present disclosure:
1. The composition/process parameters (if relevant) of the upstream decision problem already selected.
2. Evaluation metrics of the processing step. This can be composed of one or more sustainability indices, cost, process/equipment constraints, and structure state constraints related to the processing step.
3. Structure state variables: these are parameters that specify the current state of the structure of the material.
4. The target properties: the final property values to be achieved which is part of the requirements for the materials and manufacturing process design.
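The four state components listed above could be held in a small container such as the following sketch. The class name, field names, and the flattening order are illustrative assumptions and are not specified in the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DesignState:
    """State seen by one CS/PS agent (hypothetical field names)."""
    upstream_decisions: Dict[str, float] = field(default_factory=dict)
    evaluation_metrics: Dict[str, float] = field(default_factory=dict)
    structure_state: Dict[str, float] = field(default_factory=dict)
    target_properties: Dict[str, float] = field(default_factory=dict)

    def as_vector(self) -> List[float]:
        """Flatten the four groups, in a fixed order, into one feature list."""
        groups = (self.upstream_decisions, self.evaluation_metrics,
                  self.structure_state, self.target_properties)
        return [group[key] for group in groups for key in sorted(group)]
```

A fixed ordering of the keys matters because a neural-network policy expects each input position to carry the same quantity in every learning experience.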
[064]
The actions of the RL agent for a CS/PS consist of various design variables of the problem. In the case of a CS, the composition values of the elements in the material constitute the action. In the case of a PS, the design variables consist of the process parameters of the specified process. For example, for the reheating process, the design variables can be holding time and holding temperature. The reward provided by a CS/PS environment is the local reward component of the total reward that the RL agent receives.
[065]
Thus, the rewards of these RL agents consist of two components: local rewards (the reward of the specific CS/PS problem) and global rewards. The local rewards take account of the performance of the agent based on the set of criteria specific to the process under consideration. The global reward accounts for the satisfaction of target properties and desired overall characteristics of the material. Upon training based on this reward setup, an RL agent learns to set the process variables of the given decision process to satisfy both the process specific expectations and contribute to the overall properties. If some of the decision problems are common to multiple processing routes, the same RL agent can be used for the problems in all the routes. However, the RL problem needs to be configured accordingly.
[066]
In the case of a CS problem, the local reward consists of one or more of:
1. Composition dependent sustainability indices such as elemental extraction energy and a carbon footprint.
2. Cost of the material based on composition.
3. Material properties solely dependent on composition.
[067]
In the case of a PS, the local rewards consist of one or more of the following:
1. Process specific contributions towards sustainability indices. For example, energy consumption during a rolling process.
2. Process specific contributions to cost.
3. Intermediate parameter constraints: It includes one or more of
   a. Material structure state constraints such as the maximum limit on the grain size after a given processing step.
   b. Process/equipment related constraints such as limiting the maximum rolling force below a given limit in rolling.
   c. Process end state variable constraints.
[068]
The overall properties of the designed material are evaluated by a property evaluator based on the composition and the microstructure. The global rewards are evaluated based on the satisfaction of property targets and the values of desired properties. For example, the property targets might be values of a set of mechanical properties, like yield strength (YS), ultimate tensile strength (UTS), and % uniform elongation, to be achieved by the material. The fracture toughness, however, can be a desired property rather than a property target. In such a case, a positive reward can be given proportional to the value of achieved fracture toughness.
[069]
More specifically, the state of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to (i) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps that are precedent to the corresponding sequential manufacturing process step present in each manufacturing process route, (ii) the value of each of the one or more composition elements, (iii) the target value of each of one or more properties of the material, (iv) the one or more evaluation metrics, and (v) one or more material structure state parameters.
[070]
The action of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to the value of each of one or more composition elements present in the material, the value of each of the one or more process parameters of the corresponding sequential manufacturing process step present in each manufacturing process route, respectively. The reward (or the total reward) of each RL model is defined as a sum of the value of the local reward function of the corresponding RL model and the value of the global reward function.
[071]
FIG. 7 is a flowchart comprising steps for training the set of RL models, using the plurality of training data sets to obtain the set of trained RL models, in accordance with some embodiments of the present disclosure. As shown in FIG. 7, training the set of RL models, using the plurality of training data sets to obtain the set of trained RL models is further explained through steps 308a to 308e.
[072]
At step 308a, a global reward function of the set of RL models is defined. The global reward function is defined as a weighted function of the one or more properties of the material, and one or more characteristics of the material.
[073]
At step 308b, a local reward function of each RL model in the set of RL models, is defined. The local reward function of each RL model is defined as a weighted function of one or more of: (i) one or more sequential manufacturing process step specific constraints, (ii) one or more evaluation metrics, (iii) one or more material structure state constraints, and (iv) the one or more characteristics of the material that are dependent on each sequential manufacturing process step. The one or more evaluation metrics include one or more environmental, social, and governance (ESG) norms, one or more economic indices, and one or more manufacturability indices.
[074]
The weighted function of both the local reward function and the global reward function indicates the relative weights of each parameter present in the corresponding weighted function. The relative importance of various reward components is set through weights assignment. The objective here is to capture the relative importance of variables, reward each agent with a competing mix of local and global rewards, and thereby enforce the problem constraints and satisfy the target properties effectively.
[075]
FIG. 8 is a flowchart comprising steps for normalizing and assigning relative weights to reward components, in accordance with some embodiments of the present disclosure. As shown in FIG. 8, the initial steps involve analyzing the working range (lower and upper limits) of all reward components and normalizing each component based on the working range. The working range can be estimated by carrying out the design of experiments (DoE) analysis in the design space of the material and its manufacturing process design problem. Alternatively, it can be determined from past data, experience, and the allowable ranges of various composition and processing variables. The normalizing can be done to a range of 0 to 1 assuming each parameter to be uniformly distributed between the identified working ranges.
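Normalizing a reward component to the range 0 to 1 from its working range can be sketched as a linear rescaling. Clamping values that fall outside the working range is an assumption made here for robustness, and the function name is hypothetical.

```python
def normalize_to_working_range(x, lower, upper):
    """Map x linearly from [lower, upper] to [0, 1]; values outside are clamped (assumption)."""
    if upper <= lower:
        raise ValueError("upper limit must exceed lower limit")
    scaled = (x - lower) / (upper - lower)
    return min(1.0, max(0.0, scaled))


# Example with a hypothetical working range of 0 to 10 for some reward component.
print(normalize_to_working_range(5.0, 0.0, 10.0))  # 0.5
```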
[076]
There are four types of weights which can be assigned as given below:
1. Relative weights for various reward components within a CS/PS: This weight assignment incorporates the relative importance of various sustainability indices, cost, and intermediate parameter constraints which is set based on experience.
2. Relative weights for final properties (YS, UTS, etc.): If achieving some of the final properties is more important than others, it can be captured using these weights.
3. Relative weights between local and global rewards: This signifies the weightage given to CS/PS specific rewards and rewards corresponding to overall desirable properties achieved. This does not involve the various constraints of specific CS/PS problems.
4. Relative constraint penalty factors to set importance and relevance of various problem constraints.
[077]
In the present disclosure, the reward components for specific constraints of the problem are expressed as fuzzy functions. The constraints include process step specific constraints, such as ‘grain size at the end of casting’ should be less than a given value, and overall property targets, such as ‘yield strength’ should be greater than a given value, and so on. The constraints can be of two different extremes, either a lower limit or an upper limit. For an upper limit (higher limit) constraint, i.e. the parameter ($x$) value should be less than the maximum value ($x_{ul}$), a negative reward is assigned when the constraint is not satisfied, i.e. $x \ge x_{ul}$. Then the functional representation of the reward component for such specific constraint is given as,
\[
f_{ul}(x) =
\begin{cases}
0, & \text{if } x \le (1-\alpha)\,x_{ul} \\
-1, & \text{if } x \ge (1+\alpha)\,x_{ul} \\
\dfrac{(1-\alpha)\,x_{ul} - x}{2\alpha\,x_{ul}}, & \text{if } x > (1-\alpha)\,x_{ul} \text{ and } x < (1+\alpha)\,x_{ul}
\end{cases}
\]
[078]
For a lower limit constraint, i.e. the parameter ($x$) value should be more than the minimum value ($x_{ll}$), a negative reward is assigned when the constraint is not satisfied, i.e. $x \le x_{ll}$. Then the functional representation of the reward component for such specific constraint is given as,
\[
f_{ll}(x) =
\begin{cases}
0, & \text{if } x \ge (1+\alpha)\,x_{ll} \\
-1, & \text{if } x \le (1-\alpha)\,x_{ll} \\
\dfrac{x - (1-\alpha)\,x_{ll}}{2\alpha\,x_{ll}} - 1, & \text{if } x > (1-\alpha)\,x_{ll} \text{ and } x < (1+\alpha)\,x_{ll}
\end{cases}
\]
Where, $\alpha$ is the limit of the fuzzy zone. For example, for a fuzzy limit of $\pm f_x$ % with respect to either the upper or lower limit, the value of alpha ($\alpha$) is given as,
\[
\alpha = \frac{f_x}{100}
\]
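The two fuzzy constraint functions can be written directly in code. The following is a minimal sketch that assumes positive limit values and a fuzzy limit greater than zero; the Python function and argument names are illustrative.

```python
def f_ul(x, x_ul, alpha=0.10):
    """Fuzzy reward for an upper limit constraint (x should stay below x_ul)."""
    lo, hi = (1 - alpha) * x_ul, (1 + alpha) * x_ul
    if x <= lo:
        return 0.0            # comfortably inside the limit: no penalty
    if x >= hi:
        return -1.0           # clearly violated: full penalty
    return (lo - x) / (2 * alpha * x_ul)  # linear ramp from 0 to -1


def f_ll(x, x_ll, alpha=0.10):
    """Fuzzy reward for a lower limit constraint (x should stay above x_ll)."""
    lo, hi = (1 - alpha) * x_ll, (1 + alpha) * x_ll
    if x >= hi:
        return 0.0            # comfortably above the limit: no penalty
    if x <= lo:
        return -1.0           # clearly violated: full penalty
    return (x - lo) / (2 * alpha * x_ll) - 1.0  # linear ramp from -1 to 0
```

With a fuzzy limit of ±10 % (alpha = 0.10), a value exactly at the limit receives half of the full penalty, and the penalty varies continuously across the fuzzy zone instead of jumping at the limit.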
[079]
For certain parameters, the reward is based upon the normalized value of the parameter with respect to a maximum achievable value. For such parameters, the requirement is that the value should be as low as possible. The normalization reward function is given as,
\[
f_{norm}(x, x_{max}) = -\frac{x}{x_{max}}
\]
[080]
In an embodiment, the local reward function of each RL model associated to the composition selection consists of normalized elemental extraction energy, carbon footprint, cost, carbon equivalent, and corrosion loss. Hence, the local reward function ($LR_{CS}$) of the RL model associated to the composition selection is mathematically represented as:
\[
\begin{aligned}
LR_{CS} ={}& wt_{eee}^{CS} \cdot f_{norm}(EEE^{CS}, EEE_{max}^{CS}) + wt_{cfp}^{CS} \cdot f_{norm}(CFP^{CS}, CFP_{max}^{CS}) \\
&+ wt_{cost}^{CS} \cdot f_{norm}(Cost^{CS}, Cost_{max}^{CS}) + wt_{ce}^{CS} \cdot f_{norm}(CE^{CS}, CE_{max}^{CS}) \\
&+ wt_{crloss}^{CS} \cdot f_{norm}(CRloss^{CS}, CRloss_{max}^{CS})
\end{aligned}
\]
Where, $EEE^{CS}$ is the elemental extraction energy, $CFP^{CS}$ is the carbon footprint, $Cost^{CS}$ is the cost of composition elements, $CE^{CS}$ is the carbon equivalent which determines the weldability, and $CRloss^{CS}$ is the corrosion loss. In an embodiment, exemplary values of the weights are given as follows,
\[
\begin{aligned}
wt_{eee}^{CS} &= 0.1, & EEE_{max}^{CS} &= 2862.58 \\
wt_{cfp}^{CS} &= 0.1, & CFP_{max}^{CS} &= 246.2455 \\
wt_{cost}^{CS} &= 0.25, & Cost_{max}^{CS} &= 56.0239 \\
wt_{ce}^{CS} &= 0.15, & CE_{max}^{CS} &= 0.625 \\
wt_{crloss}^{CS} &= 0.15, & CRloss_{max}^{CS} &= 175.4125
\end{aligned}
\]
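As an illustration, the composition selection local reward can be evaluated with the exemplary weights and maximum values listed above. The dictionary keys and the function names are hypothetical, and the metric values passed in would come from models that are not shown here.

```python
# Exemplary weights and maximum values from the description.
WEIGHTS = {"eee": 0.1, "cfp": 0.1, "cost": 0.25, "ce": 0.15, "crloss": 0.15}
MAXIMA = {"eee": 2862.58, "cfp": 246.2455, "cost": 56.0239,
          "ce": 0.625, "crloss": 175.4125}


def f_norm(x, x_max):
    """Normalization reward: lower parameter values give a higher (less negative) reward."""
    return -x / x_max


def local_reward_cs(metrics):
    """Weighted sum of normalized composition-dependent metrics."""
    return sum(WEIGHTS[key] * f_norm(metrics[key], MAXIMA[key]) for key in WEIGHTS)
```

With these weights the local reward lies between -0.75 (every metric at its maximum value) and 0 (every metric at zero).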
[081]
In an embodiment, the local reward function of each RL model associated to the casting manufacturing process step consists of an upper limit constraint for grain size and secondary dendritic arm spacing obtained after casting. The fuzzy limit considered for the calculation of the local reward is ±10 %. Hence, the local reward function ($LR_{casting}$) of the RL model associated to the casting is mathematically represented as:
\[
LR_{casting} = wt_{gs}^{casting} \cdot f_{ul}(GS^{casting}) + wt_{sdas}^{casting} \cdot f_{ul}(SDAS^{casting})
\]
Where, $GS^{casting}$ is the grain size and $SDAS^{casting}$ is the secondary dendritic arm spacing after casting. The exemplary values of the weights are given as follows,
\[
wt_{gs}^{casting} = 2.0, \qquad wt_{sdas}^{casting} = 2.0
\]
[082]
In an embodiment, the local reward function of each RL model associated to the reheating manufacturing process step consists of an upper limit constraint for grain size and a normalized energy required for reheating. The fuzzy limit considered for the calculation of the local reward is ±10 %. Hence, the local reward function ($LR_{Reheating}$) of each RL model associated to the reheating is mathematically represented as:
\[
LR_{Reheating} = wt_{gs}^{reheating} \cdot f_{ul}(GS^{reheating}) + wt_{E}^{reheating} \cdot f_{norm}(E^{reheating}, E_{max}^{reheating})
\]
Where, $GS^{reheating}$ is the austenite grain size after reheating, and $E^{reheating}$ is the energy consumed during reheating. The exemplary values of the weights are given as follows,
\[
wt_{gs}^{reheating} = 4.0, \qquad wt_{E}^{reheating} = 0.5, \qquad E_{max}^{reheating} = 1000
\]
[083]
In an embodiment, the local reward function of each RL model associated to the rough rolling manufacturing process step consists of an upper limit constraint for grain size and roll force at each roll pass, lower limit constraint for rough rolling exit temperature and a normalized energy required for rough rolling. The fuzzy limit considered for the calculation of the local reward is ±10 %. The local reward function ($LR_{RR}$) of the RL model associated to the rough rolling is mathematically represented as:
\[
LR_{RR} = wt_{gs}^{RR} \cdot f_{ul}(GS^{RR}) + wt_{rrt}^{RR} \cdot f_{ll}(RRT^{RR}) + wt_{E}^{RR} \cdot f_{norm}(E^{RR}, E_{max}^{RR}) + \sum_{i=1}^{n_{RR}} wt_{rf}^{RR} \cdot f_{ul}(RF_{i}^{RR})
\]
Where, $n_{RR}$ is the number of roll passes, $GS^{RR}$ is the austenite grain size, $RRT^{RR}$ is the rough rolling exit temperature, $E^{RR}$ is the energy consumed for deformation and $RF_{i}^{RR}$ is the roll force corresponding to the roll pass $i$. The exemplary values of the weights are given as follows,
\[
wt_{gs}^{RR} = 3.2, \qquad wt_{rrt}^{RR} = 2.0, \qquad wt_{E}^{RR} = 0.5, \qquad E_{max}^{RR} = 11.0, \qquad wt_{rf}^{RR} = 7.5
\]
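The rough rolling local reward combines both fuzzy constraint functions, the normalization reward, and a sum over roll passes. The sketch below uses the exemplary weights above; the constraint limits (`gs_max`, `rrt_min`, `rf_max`) are hypothetical inputs, since the description does not list their values.

```python
def f_ul(x, x_ul, alpha=0.10):
    """Fuzzy upper limit reward (0 inside the limit, -1 when clearly violated)."""
    lo, hi = (1 - alpha) * x_ul, (1 + alpha) * x_ul
    if x <= lo:
        return 0.0
    if x >= hi:
        return -1.0
    return (lo - x) / (2 * alpha * x_ul)


def f_ll(x, x_ll, alpha=0.10):
    """Fuzzy lower limit reward (0 above the limit, -1 when clearly violated)."""
    lo, hi = (1 - alpha) * x_ll, (1 + alpha) * x_ll
    if x >= hi:
        return 0.0
    if x <= lo:
        return -1.0
    return (x - lo) / (2 * alpha * x_ll) - 1.0


def f_norm(x, x_max):
    """Normalization reward for quantities that should be as low as possible."""
    return -x / x_max


def local_reward_rough_rolling(gs, rrt, energy, roll_forces, limits, alpha=0.10):
    """LR_RR with the exemplary weights; `limits` holds hypothetical limit values."""
    reward = 3.2 * f_ul(gs, limits["gs_max"], alpha)       # grain size
    reward += 2.0 * f_ll(rrt, limits["rrt_min"], alpha)    # exit temperature
    reward += 0.5 * f_norm(energy, 11.0)                   # deformation energy
    reward += sum(7.5 * f_ul(rf, limits["rf_max"], alpha)  # one term per roll pass
                  for rf in roll_forces)
    return reward
```

Because the roll force weight (7.5) is the largest, a single pass that clearly exceeds the force limit dominates the local reward, which matches the higher stiffness assigned to equipment constraints.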
[084]
In an embodiment, the local reward function of each RL model associated to the finish rolling manufacturing process step consists of an upper limit constraint for grain size, exit strip velocity, and roll force at each roll pass, lower limit constraint for finish rolling exit temperature and a normalized energy required for finish rolling. The fuzzy limit considered for the calculation of the local reward is ±10 %. The local reward function ($LR_{FR}$) of each RL model associated to the finish rolling is mathematically represented as:
\[
LR_{FR} = wt_{gs}^{FR} \cdot f_{ul}(GS^{FR}) + wt_{vel}^{FR} \cdot f_{ul}(v^{FR}) + wt_{frt}^{FR} \cdot f_{ll}(FRT^{FR}) + wt_{E}^{FR} \cdot f_{norm}(E^{FR}, E_{max}^{FR}) + \sum_{i=1}^{n_{FR}} wt_{rf}^{FR} \cdot f_{ul}(RF_{i}^{FR})
\]
Where, $n_{FR}$ is the number of roll passes, $GS^{FR}$ is the austenite grain size, $v^{FR}$ is the exit strip velocity, $FRT^{FR}$ is the finish rolling exit temperature, $E^{FR}$ is the energy consumed for deformation and $RF_{i}^{FR}$ is the roll force corresponding to the roll pass $i$. The exemplary values of the weights are given as follows,
\[
wt_{gs}^{FR} = 3.2, \qquad wt_{vel}^{FR} = 2.0, \qquad wt_{frt}^{FR} = 7.5, \qquad wt_{E}^{FR} = 0.5, \qquad E_{max}^{FR} = 11.0, \qquad wt_{rf}^{FR} = 7.5
\]
[085]
In an embodiment, the local reward function of each RL model associated to the ROT & coiling consists of an upper limit constraint for grain size, and pearlite interlamellar spacing, and a lower limit constraint for ferrite volume fraction and coiling temperature. The fuzzy limit considered for the calculation of the local reward is ±10 %. The local reward function ($LR_{rot}$) of each RL model associated to the ROT & coiling is mathematically represented as:
\[
LR_{rot} = wt_{gs}^{ROT} \cdot f_{ul}(GS^{ROT}) + wt_{isl}^{ROT} \cdot f_{ul}(ILS^{ROT}) + wt_{vf}^{ROT} \cdot f_{ll}(VF^{ROT}) + wt_{ct}^{ROT} \cdot f_{ll}(CT^{ROT})
\]
Where, $GS^{ROT}$ is the grain size of ferrite, $ILS^{ROT}$ is the pearlite interlamellar spacing, $VF^{ROT}$ is the volume fraction of ferrite, and $CT^{ROT}$ is the coiling temperature. The exemplary values of the weights are given as follows,
\[
wt_{gs}^{ROT} = 3.2, \qquad wt_{isl}^{ROT} = 3.2, \qquad wt_{vf}^{ROT} = 3.2, \qquad wt_{ct}^{ROT} = 7.5
\]
[086]
In an embodiment, the global reward function is defined based upon the achieved properties with respect to the design requirements. The global reward function consists of lower limit constraints for yield strength, ultimate tensile strength, and % uniform elongation, and a normalized fracture toughness of the final component. The fuzzy limit considered for the calculation of the global reward is ±10 %. The global reward ($GR$) is given as,
\[
GR = wt_{ys} \cdot f_{ll}(YS) + wt_{uts} \cdot f_{ll}(UTS) + wt_{elon} \cdot f_{ll}(Elon) + wt_{k1c} \cdot f_{norm}(K1c, K1c_{max})
\]
Where, $YS$ is the yield strength, $UTS$ is the ultimate tensile strength, $Elon$ is the % uniform elongation, and $K1c$ is the fracture toughness. The exemplary values of the weights are given as follows,
\[
wt_{ys} = 10.0, \qquad wt_{uts} = 10.0, \qquad wt_{elon} = 10.0, \qquad wt_{k1c} = 1.0, \qquad K1c_{max} = 600
\]
[087]
In an embodiment, the total reward is defined as the sum of all the local rewards and the global reward. A predefined relative weight is provided to all local reward components excluding the constraints. In an embodiment, an exemplary predefined relative weight provided for all local reward components excluding the constraints is 0.25. Similarly, a predefined relative weight is provided to the global reward components. In an embodiment, the predefined relative weight provided to the global reward components is 1.0.
\[
wt_{LR} = 0.25
\]
[088]
The relative importance of various constraints of the problem is set through a constraint stiffness (constraint penalty factor). The constraint stiffness for the local rewards is given as follows,
\[
st_{FRT} = 7.5, \qquad st_{RF} = 7.5, \qquad st_{CT} = 7.5, \qquad st_{others} = 4.0
\]
[089]
The constraint stiffness for the roll force ($st_{RF}$) constraint in both rough and finish rolling, the FRT ($st_{FRT}$) constraint in finish rolling, and the coiling temperature ($st_{CT}$) constraint is taken as 7.5, and for all other cases it is taken as 4.0. As the global reward considers the properties for which the design is done, the constraint stiffness for all the design requirements is given the highest weightage (for example, 10.0), as shown below,
\[
st_{ys} = 10.0, \qquad st_{uts} = 10.0, \qquad st_{elon} = 10.0
\]
[090]
At step 308c, a training environment for each RL model is defined, using the local reward function of corresponding RL model, and by utilizing a set of computational simulation models. The training environment of each RL model determines a next state, the value of the local reward function, and a learning episode completion status, for a given state, based on an action taken by the corresponding RL model.
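A training environment of this kind can be sketched as a thin wrapper around a simulation model and a local reward function. The class name, the callables, and the toy reheating model below are hypothetical, and treating each design decision as a one-step episode is an assumption rather than something the description states.

```python
class StepEnvironment:
    """Minimal environment for one CS/PS decision (illustrative only)."""

    def __init__(self, simulate, local_reward):
        self.simulate = simulate          # computational simulation model (callable)
        self.local_reward = local_reward  # local reward function (callable)

    def step(self, state, action):
        """Return (next state, local reward, episode completion status)."""
        structure = self.simulate(state, action)
        next_state = dict(state, structure=structure, last_action=action)
        reward = self.local_reward(structure, action)
        done = True  # assumption: one design decision per learning episode
        return next_state, reward, done


# Toy stand-ins for a reheating model and its reward (made-up relations).
env = StepEnvironment(
    simulate=lambda state, action: {"grain_size": 50.0 + 0.05 * action["holding_time"]},
    local_reward=lambda structure, action: -structure["grain_size"] / 100.0,
)
next_state, reward, done = env.step({"targets": {"YS": 350.0}}, {"holding_time": 400.0})
```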
[091]
At step 308d, each training data set of the plurality of training data sets, along with the target value of each of the one or more properties of each material of the plurality of materials, is transformed to determine (i) the state, (ii) the next state, (iii) the action, (iv) the reward, and (v) the learning episode completion status, of each RL model. FIG. 9 is a flowchart comprising steps for defining the training environment for each RL model using the plurality of training datasets, in accordance with some embodiments of the present disclosure. As shown in FIG. 9, each training dataset is transformed into learning experiences upon which the RL agents can learn. The learning experiences comprise the RL model specific training data or, in other words, the training data for the composition selection (CS) and the training data for each manufacturing process step (PS). In an embodiment, a data transformation module (202a) (not shown in FIG. 2) of the data pre-processing module 202 is configured to transform each training data set into (i) the state, (ii) the next state, (iii) the action, (iv) the reward, and (v) the learning episode completion status, for each RL model.
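The transformation of one historical record into per-agent learning experiences can be sketched as follows. The record layout, the function name, and the `reward_fn` callable are hypothetical. For brevity the state here carries only the upstream decisions and the targets; the evaluation metrics and structure state variables would be filled in by the simulation models. Marking only the last decision as episode-complete is also an assumption.

```python
def to_experiences(record, steps, reward_fn):
    """Build one (state, action, reward, next_state, done) tuple per decision."""
    decisions = [("composition selection", record["composition"])]
    decisions += [(step, record["process"][step]) for step in steps]
    experiences, upstream = {}, {}
    for index, (name, action) in enumerate(decisions):
        state = {"upstream": dict(upstream), "targets": record["targets"]}
        upstream[name] = action  # this decision becomes upstream for later steps
        next_state = {"upstream": dict(upstream), "targets": record["targets"]}
        done = index == len(decisions) - 1
        experiences[name] = (state, action, reward_fn(name, record), next_state, done)
    return experiences
```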
[092]
At step 308e, each RL model defined for (i) the composition selection, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, is iteratively trained, by utilizing the training environment of the corresponding RL model and each training dataset present in the plurality of training data sets, to obtain the set of trained RL models. In the present disclosure, each RL model defined for (i) the composition selection, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, is iteratively trained in a reverse sequential order.
[093]
FIG. 10 shows an exemplary training strategy of the set of RL models, in accordance with some embodiments of the present disclosure. As shown in FIG. 10, the RL agents for the various CS/PS problems are trained in the reverse sequence, i.e., the last PS RL agent is trained first, and the remaining agents are then trained one after another, each conditioned on the already trained downstream RL agents, which are used as prediction models in evaluation mode. For example, in the manufacturing process route 1 of FIG. 5B, the RL agent of the last manufacturing process step ‘ROT & coiling’ is trained first. The RL agent of the previous manufacturing process step ‘finish rolling’ is trained next, and so on, until the RL agent of the composition selection is trained last, to obtain the set of trained RL models of the corresponding composition selection and the manufacturing process steps in each manufacturing process route.
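The reverse-order training schedule can be sketched as a simple loop. The code below only demonstrates the ordering and the hand-off of already trained downstream agents; `train_in_reverse` and `stub_train_agent` are hypothetical names, and the stub stands in for actual RL training, which is not shown.

```python
from typing import Callable, Dict, List

# Step names follow the conventional casting route (Route 1) of the example.
ROUTE_1 = [
    "composition selection", "casting", "reheating",
    "rough rolling", "finish rolling", "ROT & coiling",
]


def train_in_reverse(
    route: List[str],
    train_agent: Callable[[str, List[object]], object],
) -> Dict[str, object]:
    """Train the last step first; pass trained downstream agents to earlier ones."""
    trained: Dict[str, object] = {}
    for index in range(len(route) - 1, -1, -1):
        step = route[index]
        downstream = [trained[name] for name in route[index + 1:]]
        trained[step] = train_agent(step, downstream)
    return trained


training_log: List[str] = []


def stub_train_agent(step: str, downstream: List[object]) -> str:
    # Placeholder for real RL training; it only records the training order.
    training_log.append(step)
    return f"trained<{step}>|downstream={len(downstream)}"


models = train_in_reverse(ROUTE_1, stub_train_agent)
```

In this sketch the 'ROT & coiling' agent sees no downstream agents, while the composition selection agent receives all five trained process step agents.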
[094]
The set of trained RL models obtained at this step can be used in real time to produce different design solutions in terms of the composition selection and the corresponding process parameters for the desired material with the desired properties. Hence, the set of trained RL models makes the present disclosure a generative design approach, which can generate multiple material and manufacturing process design solutions for a given design requirement.
[095]
At step 310 of the method 300, the one or more hardware processors 104 of the system 100 are configured to receive the target value of each of the one or more properties of a desired material. The desired material is the material required as an outcome of the manufacturing process, and the target value of each property of the desired material refers to the value that the desired material should have for that property.
[096]
At step 312 of the method 300, the one or more hardware processors 104 of the system 100 are configured to pass the target value of each of the one or more properties of the desired material received at step 310 of the method 300, to the set of trained RL models obtained at step 308 of the method 300. The set of trained RL models sequentially predicts different design solutions comprising (i) the one or more material compositions, and (ii) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, for each of the one or more material compositions. Here, each of the one or more material compositions comprises the value of each of the one or more composition elements that are contained in the desired material after the manufacturing process. In an embodiment, the prediction module 206 is configured to predict different design solutions by the set of trained RL models.
[097]
FIG. 11 is a flowchart comprising steps for the generative design approach to generate multiple material and manufacturing process designs for a given requirement, in accordance with some embodiments of the present disclosure. The generative design approach of the present disclosure starts with gathering design requirements from the designer, which predominantly involve target values for various identified properties of the materials. The set of trained RL agents is loaded in the next step. Some of the RL agents may be common to multiple processing routes, as the decision problem may be common. The two approaches used to generate multiple design solutions for the given requirement are explained below. The first approach allows the agents to learn on the given problem for multiple learning episodes. The second approach adds a small noise to the individual agent predictions.
[098]
In the first approach, the RL agents are used to predict the composition and processing parameters for the identified processing routes separately. Following each instance of agent predictions, the experiences corresponding to the individual agents are determined and added to the memory of the respective agents in the prescribed format (state, next state, action, reward, done). The agents are trained with the additional experience, and the updated agents are used for predictions for the same requirement in the next iteration. This process is iterated for the prescribed number of steps to generate multiple sets of material and manufacturing process designs for the identified process routes. This approach exploits the understanding of the reward landscapes of each individual RL agent and the effect of sequential model predictions to generate multiple material and manufacturing process designs for the same requirement.
[099]
In the second approach, a small random Gaussian noise is added to each agent’s predictions. The material and manufacturing process designs are generated through sequential prediction of the respective agents for each processing route. This process is iterated for the prescribed number of steps to generate multiple sets of material and manufacturing process designs for the identified process routes. The approach generates multiple design solutions by slightly augmenting the preferred design space of the learnt agents, combined with the effect of sequential model predictions. This approach can be applied to RL algorithms with deterministic policies. If the policy is stochastic, however, this approach may not be needed.
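The noise-based approach can be sketched as repeated sequential prediction with a Gaussian perturbation on each action. The example is a minimal illustration under stated assumptions: the noise is scaled relative to each predicted value (`rel_sigma`), the toy policies are invented, and clipping to the feasible variable ranges, which a practical implementation would likely need, is omitted.

```python
import random
from typing import Callable, Dict, List, Tuple

Policy = Callable[[Dict[str, float]], Dict[str, float]]


def generate_designs(
    agents: List[Tuple[str, Policy]],
    targets: Dict[str, float],
    n_designs: int,
    rel_sigma: float,
    seed: int = 0,
) -> List[Dict[str, Dict[str, float]]]:
    """Run the agents in sequence n_designs times, perturbing every action."""
    rng = random.Random(seed)
    designs = []
    for _ in range(n_designs):
        context = dict(targets)  # state seen by the first agent
        design: Dict[str, Dict[str, float]] = {}
        for name, policy in agents:
            action = policy(context)
            # Assumption: zero-mean Gaussian noise proportional to each value.
            noisy = {k: v + rng.gauss(0.0, rel_sigma * abs(v)) for k, v in action.items()}
            design[name] = noisy
            context.update(noisy)  # downstream agents see the perturbed decision
        designs.append(design)
    return designs


def toy_composition_policy(context: Dict[str, float]) -> Dict[str, float]:
    return {"C": 0.10}  # hypothetical deterministic policy output


def toy_reheating_policy(context: Dict[str, float]) -> Dict[str, float]:
    return {"reheat_temp_c": 1100.0 + 1000.0 * context["C"]}  # hypothetical


AGENTS = [("composition", toy_composition_policy), ("reheating", toy_reheating_policy)]
designs = generate_designs(AGENTS, {"YS_LL": 350.0}, n_designs=3, rel_sigma=0.01)
```

Because each downstream policy conditions on the perturbed upstream decision, a small change in composition propagates into the reheating parameters, which is the "effect of sequential model predictions" referred to above.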
[0100]
As shown in FIG. 11, the generative design approach of the present disclosure predicts (m+n) sets of design solutions, each containing the material and the process design (process parameters) for k manufacturing process routes. The designer may further validate each set of design solutions by comparing the target properties of the required material with the achieved properties of the material resulting from that design solution, in order to choose the design solution sets for experimentation.
[0101]
The present disclosure presents a novel generalized framework, in the form of methods and systems, for the integrated design of materials and their manufacturing processes to achieve given target requirements. The proposed design framework employs distinct RL agents to carry out the various design process steps, such as composition selection and the various manufacturing process steps. The RL agents in the framework function in a synergistic manner. Each model makes conditional predictions, informed by the output of its preceding step, thus ensuring a coherent and sequential flow of information and decision-making, akin to the original decision process.
[0102]
The modular approach reduces the complexity of the individual models, as opposed to a single ML model learning the relationship between the various composition and manufacturing process variables and the final properties, and makes the framework scalable. It also enables integrated design of materials and their manufacturing processes considering multiple feasible discrete process routes. The environment for training an RL agent consists of one or more analytical, empirical, or physics-based models that represent the design steps and the various calculations of parameters of interest. The use of RL agents allows the models to learn from both past design data and computational models, making the design framework less reliant on data and more efficient.
[0103]
The various RL agents in the design framework are trained in a sequence opposite to that of the evolution of the material. That is, the RL agent representing the last manufacturing process step is trained first, and the rest of the RL agents are trained subsequently, step by step. While training a given RL agent, the trained RL agents that correspond to the succeeding steps in the design process are utilized as learnt predictor models. This type of training makes the agent being trained aware of the performance of the succeeding process step agents. This training approach, along with conditional prediction by the models during the design stage, allows the models to work in tandem effectively to achieve the overall requirements. For instance, in the case of a hot rolled steel design, while training the rough rolling RL agent, the succeeding learnt RL agents for the finish rolling and ROT & coiling steps are used to predict the designs for the respective process steps. In this way, the rough rolling RL agent is aware of the performance of the learnt succeeding models. This training method allows the framework to exploit the well-performing regions of the RL agents to generate feasible designs for a requirement with limited training.
[0104]
The RL agents have an inherent hierarchy, because the earlier decisions have a large influence on what the subsequent agents can accomplish. Therefore, a novel reward function is designed. The reward provided to a given process step RL agent is composed of two components: a local reward component and a global reward component. The local reward is given based on the agent’s performance in satisfying its process specific goals and constraints, while the global reward is given for satisfying the overall requirements (which depends on the performance of all the RL agents in the framework). This type of reward system enables the individual RL models to consider both the process specific constraints and goals and the final property requirements during decision making, emulating the scenario of multiple design experts working in tandem to achieve the overall goal. For example, in the case of hot rolled steel design, for the casting process RL model, the reward is composed of a local component dependent on casting process constraints, such as constraints on the microstructural state at the end of casting, and a global component dependent on the final properties, such as YS and UTS, achieved by the designed material. In addition, the local and global rewards integrate various desirable parameters such as manufacturability, ESG norms, cost, process energy etc. and their relative importance through a weighted approach, leading to the design of manufacturable, cost effective and sustainable materials.
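The two-part reward can be illustrated as the sum of a weighted local term and a weighted global term. The sketch below assumes simple linear weighting; the component names, values, and weights are hypothetical placeholders rather than figures from the disclosure.

```python
from typing import Dict


def weighted_sum(components: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear combination of named reward components (assumed weighting form)."""
    return sum(weights[name] * value for name, value in components.items())


def agent_reward(
    local: Dict[str, float],
    local_weights: Dict[str, float],
    global_: Dict[str, float],
    global_weights: Dict[str, float],
) -> float:
    # Total reward for one agent: local (process specific) plus global (final properties).
    return weighted_sum(local, local_weights) + weighted_sum(global_, global_weights)


# Hypothetical values for a reheating-type agent.
reward = agent_reward(
    local={"grain_size_penalty": -1.0, "process_energy": -0.5},
    local_weights={"grain_size_penalty": 2.0, "process_energy": 1.0},
    global_={"property_penalty": 0.0, "fracture_toughness": 0.8},
    global_weights={"property_penalty": 1.0, "fracture_toughness": 0.5},
)
```

Changing only the weight dictionaries shifts the relative importance of, for example, process energy against fracture toughness without touching the agents or the environment.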
[0105]
The generative design approach of the present disclosure combines two methods to generate multiple material and manufacturing process design alternatives for a given requirement. For a given processing route and set of reward weights, multiple design alternatives can be generated by (a) allowing the agents to learn on the given requirement condition for multiple learning episodes, and (b) adding a small noise to the individual agent predictions. These methods exploit, respectively, the agents' understanding of the reward landscape and a perturbation of the preferred design space. They reduce the reliance on individual agent performance levels. Together, they can potentially generate more diverse solution alternatives than human experts, leading to improved design quality and reduced design cycle time.
[0106]
Further, the integrated design framework of the present disclosure augments the current design process by providing multiple design alternatives to the designer, which leads to efficient discovery of compositions and manufacturing processes and to reduced manual effort and fewer experimental trials.
[0107]
The present disclosure provides a modular multi-agent setup for automated design, wherein a distinct reinforcement learning (RL) agent is used to mirror the composition selection (CS) and each of the various process steps (PS) involved in manufacturing the material. The distinct RL agents inherit the hierarchy involved in the design of materials and their manufacturing processes. For example, in the case of a hot rolled steel design, distinct RL agents are assigned to make decisions on composition selection, casting, reheating, rolling and ROT. This modular architecture provides the following advantages.
a. It increases flexibility and allows multiple feasible processing routes to be considered. The same RL agent can be used to make decisions for a processing step common to multiple processing routes.
b. It allows distinct RL agents to be trained with the process specific data available.
[0108]
The methods and systems of the present disclosure represent the composition selection and the various process step design problems as RL problems and train the RL agents effectively, allowing the models to learn from both past design data and computational models (empirical/analytical/physics-based) representing the design process.
[0109]
Further, the methods and systems of the present disclosure integrate manufacturability, ESG norms, cost, process energy etc. and their relative importance into the design decision making process of the RL agents by expressing them as reward components upon which the RL agents are trained. This leads to the following benefits:
a. Fewer material designs rejected due to poor manufacturability or failure to satisfy ESG norms.
b. Cost effective and greener designs.
[0110]
A generative design methodology for generating multiple alternatives of materials and their manufacturing process designs for a given requirement is presented. It provides the designer with multiple design options of potentially larger diversity, leading to faster convergence and reduced design cycle time.
[0111]
The methods and systems of the present disclosure accelerate the materials design life cycle and reduce the experimental trials required, through automated generation of multiple candidate material and manufacturing process designs from various feasible processing routes for a given requirement. The methods and systems of the present disclosure can potentially provide diverse and unexplored solutions of materials and manufacturing process designs as compared to expert driven decisions, which are often limited by experiential biases or lack of sufficient exploration.
[0112]
The various manufacturability, ESG norm, cost, and process energy considerations are incorporated into the decision making, leading to the design of manufacturable, cost effective and sustainable materials and manufacturing processes. The methods and systems of the present disclosure use RL agents to learn from both past design data and computational models, making the design framework less reliant on data, practical, and significantly more efficient than the conventional techniques in the art.
Example scenario:
[0113]
The methods and systems of the present disclosure were tested on the design of hot rolled steel sheets. The design process considered was that of FIG. 5B, which contains two different casting routes: a conventional casting route (Route 1) and a thin slab casting route (Route 2). The major difference between the process routes is the presence of a rough rolling process step in the conventional casting route. In addition, the feasible variable ranges of the various processing steps vary with the processing route.
[0114]
In this design process, the design input consists of target requirements of yield strength (YS), ultimate tensile strength (UTS), % uniform elongation, and sheet thickness. The objective of the design process was to satisfy the target properties while minimizing or maximizing the following parameters.
a. Minimize: energy functions, carbon footprint, cost
b. Maximize: weldability, corrosion resistance, fracture toughness
[0115]
In addition, there are several process specific constraints which need to be satisfied. These constraints relate to process equipment constraints, process end state constraints, and microstructural state constraints which were derived from domain knowledge. For example, in the casting process, the grain size and secondary dendritic arm spacing (SDAS) at the end of casting should be less than the respective limits provided.
FIG. 12 shows the RL model architecture adopted for hot rolled steel sheet design, in accordance with some embodiments of the present disclosure. The architecture consists of six RL agents, which were assigned to the composition selection, casting, reheating, rough rolling, finish rolling, and run out table (ROT) & coiling processes, respectively. The rough rolling RL agent was used for the conventional casting route only. The casting route was provided as an additional input to train and predict solutions corresponding to a given processing route. The state, actions, and rewards are set up for each of the six decision problems as per the approach presented in the present disclosure. For example, in the case of the reheating RL agent, the state parameters are listed in Table 1, wherein LL means lower limit, and the actions and reward components are listed after the table.
Parameter type | Parameter
Target requirements | LL of YS; LL of UTS; LL of % Elongation
Composition (in %) | C, Mn, Si, Cr, Ni
Casting process design outcomes | Processing route; Casting Grain Size
Table 1
▪ Actions
  o Reheating time
  o Reheating temperature
▪ Reward components
  o Local reward (process specific)
    ▪ Penalty for violating the grain size limit at the end of reheating.
    ▪ Negative reward proportional to the reheating process energy.
  o Global reward (based on final properties achieved)
    ▪ Penalty for not achieving any of the final property targets (a function of the number of violations and the degree of violations).
    ▪ Positive reward proportional to the fracture toughness achieved.
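The global penalty for missed property targets is described only as a function of the number and the degree of violations. One possible form, shown purely as an assumption, counts each violated lower limit once and adds its relative shortfall; the function name `property_penalty` and the numeric values are hypothetical.

```python
from typing import Dict


def property_penalty(achieved: Dict[str, float], lower_limits: Dict[str, float]) -> float:
    """Negative penalty combining how many lower limits are missed and by how much.

    Assumed form: -(number of violations + sum of relative shortfalls).
    Lower limits are assumed to be strictly positive.
    """
    shortfalls = [
        (limit - achieved[name]) / limit
        for name, limit in lower_limits.items()
        if achieved[name] < limit
    ]
    return -(len(shortfalls) + sum(shortfalls))


# Hypothetical design that misses only the YS lower limit.
penalty = property_penalty(
    achieved={"YS": 340.0, "UTS": 500.0, "elongation_pct": 18.0},
    lower_limits={"YS": 350.0, "UTS": 450.0, "elongation_pct": 15.0},
)
```

With this form, a design that narrowly misses one target is penalized less than one that misses it badly, and both are penalized less than a design that misses several targets.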
[0116]
The variables involved in the design process are predominantly continuous. Accordingly, an off-policy RL algorithm, Deep Deterministic Policy Gradient (DDPG), is selected for this framework. DDPG is used as an exemplary implementation, and any suitable RL algorithm may be substituted as appropriate. Empirical models, which represent the various processing steps and property evaluations, are used to create the RL model environments for this design problem. The relative weights of the different reward component types, adopted for the problem based on domain knowledge and experience, are provided for the reheating process and the global reward components.
[0117]
The RL agents are trained as per the training approach presented, wherein the ROT & coiling RL agent was trained first and the rest of the agents were trained following the hierarchical sequence, keeping the downstream agents as learnt predictor models. Training data were created using the approach presented in the present disclosure, wherein a design of experiments (DoE) was performed on the design space of the materials and process design problem to create a first set of data, with inputs consisting of the various CS/PS variables and outputs consisting of the final achieved properties. The first set of training data was transformed into learning experiences by running it through different input requirement sets and calculating the associated rewards. This was combined with online RL training with random start states to train each RL agent. The generative design approach presented was used to generate materials and process designs using the learnt models for a given requirement. FIG. 13 shows design solutions generated by the RL model architecture adopted for hot rolled steel sheet design of FIG. 12, for a given requirement, in accordance with some embodiments of the present disclosure.
[0118]
To evaluate the performance of the developed framework, a benchmark study was carried out with design experts (materials engineers familiar with hot rolled steel design). In this benchmark study, the design experts and the RL agents were given three sets of different design requirements to work upon. A maximum of 10 design iterations could be performed on a given design problem to come up with the best set of CS/PS variable values to achieve the requirements. The same empirical models were used to evaluate the final properties and the process outcomes of the various processing steps. The RL agents were asked to generate 5 materials and process designs for each of the two processing routes, adding up to 10 materials and process designs for a given problem, following the learning and iterating approach in the generative design strategy. A design score was assigned to each design based on the various design goals and constraints defined.
[0119]
FIG. 14 shows comparison results of the design solutions generated by the RL model architecture adopted for hot rolled steel sheet design of FIG. 12, and by the domain experts, in accordance with some embodiments of the present disclosure. As shown in FIG. 14, the design framework outperformed the domain experts in terms of both design solution performance and design cycle time.
[0120]
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[0121]
The embodiments of the present disclosure herein address unresolved problems of automated design of materials and the manufacturing process for the desired properties. The present disclosure integrates other important parameters such as manufacturability, ESG norms, cost, process energy etc. and their relative importance into the design decision making process of the RL agents by expressing them as reward components upon which the RL agents are trained.
[0122]
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including, e.g., any kind of computer such as a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be, e.g., hardware means such as an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
[0123]
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include, but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0124]
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[0125]
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0126]
It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
We Claim:
1. A processor-implemented method (300), comprising the steps of:
receiving, via one or more hardware processors, a historical design data associated to a plurality of materials, from a repository (302);
creating, via the one or more hardware processors, a training dataset of each material of the plurality of materials, from the historical design data, to obtain a plurality of training data sets associated with the plurality of materials, wherein the training data set of each of the plurality of materials comprises (i) an achieved value of each of one or more properties of the material, (ii) a value of each of one or more composition elements present in the material, (iii) a value of each of one or more process parameters of each of one or more sequential manufacturing process steps present in each of one or more manufacturing process routes that produces the material (304);
receiving, via the one or more hardware processors, a target value of each of the one or more properties of each material of the plurality of materials (306); and
training, via the one or more hardware processors, a set of reinforcement learning (RL) models, using the plurality of training data sets and the target value of each of the one or more properties of each of the plurality of materials to obtain a set of trained RL models, wherein each RL model in the set of RL models is defined for (i) a composition selection from the one or more composition elements, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes (308).
2. The processor-implemented method (300) as claimed in claim 1, comprising the steps of:
receiving, via the one or more hardware processors, the target value of each of the one or more properties of a desired material (310); and

passing, via the one or more hardware processors, the target value of each of the one or more properties of the desired material, to the set of trained RL models, to sequentially predict (i) one or more material compositions, and (ii) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, for each of the one or more material compositions, and wherein each of the one or more material compositions comprises the value of each of the one or more composition elements (312).
3. The processor-implemented method (300) as claimed in claim 1, wherein training the set of RL models, using the plurality of training data sets to obtain the set of trained RL models, comprises:
defining a global reward function of the set of RL models, as a weighted function of the one or more properties of the material, and one or more characteristics of the material (308a);
defining a local reward function of each RL model in the set of RL models, as a weighted function of one or more of: (i) one or more sequential manufacturing process step specific constraints, (ii) one or more evaluation metrics, (iii) one or more material structure state constraints, and (iv) the one or more characteristics of the material that are dependent on each sequential manufacturing process step, and wherein the one or more evaluation metrics comprises one or more environment, social, and governance (ESG) norms, one or more economic indices, and one or more manufacturability indices (308b);
defining a training environment of each RL model in the set of RL models, using the local reward function of associated RL model, and by utilizing a set of computational simulation models, wherein the training environment of each RL model determines a next state, the value of the local reward function, and a learning episode completion status, for a given state, based on an action taken by the associated RL model (308c);

transforming each training data set of the plurality of training data sets along with the target value of each of the one or more properties of each material of the plurality of materials, at a time, to determine (i) a state, (ii) the next state, (iii) the action, (iv) the reward, and (v) the learning episode completion status, of each RL model in the set of RL models (308d); and
iteratively training each RL model in the set of RL models, defined for (i) the composition selection, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, in a reverse sequential order, by utilizing the training environment of the associated RL model and each training dataset present in the plurality of training data sets, to obtain the set of trained RL models (308e).
4. The processor-implemented method (300) as claimed in claim 3, wherein: the state of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to one or more of : (i) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps that are precedent to the corresponding sequential manufacturing process step present in each manufacturing process route, (ii) the value of each of the one or more composition elements, (iii) the target value of each of one or more properties of the material, (iv) the one or more evaluation metrics, and (v) one or more material structure state parameters,
the action of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to the value of each of one or more composition elements present in the material, the value of each of the one or more process parameters of the corresponding sequential manufacturing process step present in each manufacturing process route, respectively, and

the reward of each RL model is defined as a sum of a value of the local reward function of the associated RL model and the value of the global reward function.
5. A system (100) comprising:
a memory (102) storing instructions;
one or more input/output (I/O) interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more I/O interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
receive historical design data associated with a plurality of materials, from a repository;
create a training data set of each material of the plurality of materials, from the historical design data, to obtain a plurality of training data sets associated with the plurality of materials, wherein the training data set of each of the plurality of materials comprises (i) an achieved value of each of one or more properties of the material, (ii) a value of each of one or more composition elements present in the material, and (iii) a value of each of one or more process parameters of each of one or more sequential manufacturing process steps present in each of one or more manufacturing process routes that produce the material;
receive a target value of each of the one or more properties of each material of the plurality of materials; and
train a set of reinforcement learning (RL) models, using the plurality of training data sets and the target value of each of the one or more properties of each of the plurality of materials to obtain a set of trained RL models, wherein each RL model in the set of RL models is defined for (i) a composition selection from the one or more composition elements, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes.

6. The system (100) as claimed in claim 5, wherein the one or more hardware processors (104) are configured to:
receive the target value of each of the one or more properties of a desired material; and
pass the target value of each of the one or more properties of the desired material, to the set of trained RL models, to sequentially predict (i) one or more material compositions, and (ii) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, for each of the one or more material compositions, and wherein each of the one or more material compositions comprises the value of each of the one or more composition elements.
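The sequential prediction recited in this claim can be sketched as a chain in which each trained model acts on the targets plus all upstream decisions. The stub policies below stand in for trained RL models; `design`, the policy signatures, and every numeric value are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Policy = Callable[[Dict[str, float]], Dict[str, float]]


def design(targets: Dict[str, float],
           policies: List[Tuple[str, Policy]]) -> Dict[str, Dict[str, float]]:
    """Run the models in route order: composition selection first, then each step."""
    state = {f"target.{k}": v for k, v in targets.items()}
    plan = {}
    for name, policy in policies:
        action = policy(state)      # each model sees targets and upstream choices
        plan[name] = action
        state = {**state, **{f"{name}.{k}": v for k, v in action.items()}}
    return plan


# Stub policies standing in for trained RL models (assumed, for illustration).
policies = [
    ("composition_selection", lambda s: {"C": 0.2, "Mn": 1.1}),
    ("rolling", lambda s: {"temp": 900.0 if s["composition_selection.C"] < 0.3 else 950.0}),
    ("annealing", lambda s: {"temp": 650.0, "time": 2.0}),
]
plan = design({"yield_strength": 400.0}, policies)
```

Repeating the call with a stochastic composition policy would yield several candidate compositions, each with its own set of process parameters, in line with the "one or more material compositions" wording.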
7. The system (100) as claimed in claim 5, wherein the one or more hardware processors (104) are configured to train the set of RL models, using the plurality of training data sets to obtain the set of trained RL models, by:
defining a global reward function of the set of RL models, as a weighted function of the one or more properties of the material, and one or more characteristics of the material;
defining a local reward function of each RL model in the set of RL models, as a weighted function of one or more of: (i) one or more sequential manufacturing process step specific constraints, (ii) one or more evaluation metrics, (iii) one or more material structure state constraints, and (iv) the one or more characteristics of the material that are dependent on each sequential manufacturing process step, and wherein the one or more evaluation metrics comprise one or more environmental, social, and governance (ESG) norms, one or more economic indices, and one or more manufacturability indices;
defining a training environment of each RL model in the set of RL models, using the local reward function of the associated RL model, and by utilizing a set of computational simulation models, wherein the training environment of each RL model determines a next state, the value of the local reward function, and a learning episode completion status, for a given state, based on an action taken by the associated RL model;
transforming each training data set of the plurality of training data sets along with the target value of each of the one or more properties of each material of the plurality of materials, at a time, to determine (i) a state, (ii) the next state, (iii) the action, (iv) the reward, and (v) the learning episode completion status, of each RL model in the set of RL models; and
iteratively training each RL model in the set of RL models, defined for (i) the composition selection, and (ii) each of the one or more sequential manufacturing process steps present in each of the one or more manufacturing process routes, in a reverse sequential order, by utilizing the training environment of the associated RL model and each training data set present in the plurality of training data sets, to obtain the set of trained RL models.
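The reverse sequential training order can be sketched as a simple schedule. This is one reading of the claim: the model for the last process step is trained first and the composition selection model last, with already-trained downstream models made available to each upstream model. The function names and the `train_one` callback are hypothetical.

```python
from typing import Callable, Dict, List


def reverse_training_order(route: List[str]) -> List[str]:
    """Last process step first, composition selection last (assumed reading)."""
    return list(reversed(["composition_selection"] + list(route)))


def train_all(route: List[str],
              train_one: Callable[[str, Dict[str, object]], object]) -> Dict[str, object]:
    """Train each model in reverse order, passing the downstream models trained so far."""
    trained: Dict[str, object] = {}
    for name in reverse_training_order(route):
        # Copy so that train_one cannot mutate the registry of trained models.
        trained[name] = train_one(name, dict(trained))
    return trained


# Stub trainer that records which downstream models were available (illustration only).
route = ["rolling", "annealing"]
order = reverse_training_order(route)
trained = train_all(route, lambda name, downstream: sorted(downstream))
```

With this ordering, an upstream model can be rewarded according to what the downstream models achieve from the state it hands over, which is one motivation for training in reverse.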
8. The system (100) as claimed in claim 7, wherein:
the state of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to one or more of: (i) the value of each of the one or more process parameters of each of the one or more sequential manufacturing process steps that are precedent to the corresponding sequential manufacturing process step present in each manufacturing process route, (ii) the value of each of the one or more composition elements, (iii) the target value of each of one or more properties of the material, (iv) the one or more evaluation metrics, and (v) one or more material structure state parameters,
the action of each RL model of the composition selection, and each sequential manufacturing process step, is defined with respect to the value of each of one or more composition elements present in the material, and the value of each of the one or more process parameters of the corresponding sequential manufacturing process step present in each manufacturing process route, respectively, and
the reward of each RL model is defined as a sum of a value of the local reward function of the associated RL model and the value of the global reward function.

Documents

Application Documents

#	Name	Date
1	202421016615-STATEMENT OF UNDERTAKING (FORM 3) [07-03-2024(online)].pdf	2024-03-07
2	202421016615-REQUEST FOR EXAMINATION (FORM-18) [07-03-2024(online)].pdf	2024-03-07
3	202421016615-FORM 18 [07-03-2024(online)].pdf	2024-03-07
4	202421016615-FORM 1 [07-03-2024(online)].pdf	2024-03-07
5	202421016615-FIGURE OF ABSTRACT [07-03-2024(online)].pdf	2024-03-07
6	202421016615-DRAWINGS [07-03-2024(online)].pdf	2024-03-07
7	202421016615-DECLARATION OF INVENTORSHIP (FORM 5) [07-03-2024(online)].pdf	2024-03-07
8	202421016615-COMPLETE SPECIFICATION [07-03-2024(online)].pdf	2024-03-07
9	Abstract1.jpg	2024-04-10
10	202421016615-FORM-26 [08-05-2024(online)].pdf	2024-05-08
11	202421016615-Proof of Right [10-07-2024(online)].pdf	2024-07-10
12	202421016615-Power of Attorney [11-04-2025(online)].pdf	2025-04-11
13	202421016615-Form 1 (Submitted on date of filing) [11-04-2025(online)].pdf	2025-04-11
14	202421016615-Covering Letter [11-04-2025(online)].pdf	2025-04-11
15	202421016615-FORM-26 [22-05-2025(online)].pdf	2025-05-22