“Reinforcement Learning Model To Optimize Steel Coil Fabrication”


Abstract: The present disclosure relates to a system and a method for implementing a reinforcement learning model to optimize steel coil fabrication. The method includes receiving mechanical properties to be achieved for a steel coil, and one or more control parameters for each operation of a plurality of operations implemented during the steel coil fabrication. The method includes determining, based on the one or more control parameters, an optimal action for each control parameter of the one or more control parameters, such that the optimal action indicates an action required to achieve the mechanical properties of the steel coil. The method includes implementing a machine learning (ML) model to create an environment such that a learning agent associated with the environment is configured to: learn the optimal action for each of the one or more control parameters, and control each operation of the plurality of operations of the steel coil fabrication.


Patent Information

Application #

Filing Date

30 December 2023

Publication Number

32/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Email

Parent Application

Applicants

TATA STEEL LIMITED

Jamshedpur – 831 001, Jharkhand, India

Inventors

1. Sudipto Trivedy

C/o., TATA STEEL LIMITED, Jamshedpur – 831 001, Jharkhand, India

2. Devraj Ranjan

C/o., TATA STEEL LIMITED, Jamshedpur – 831 001, Jharkhand, India

3. Rajesh Shyam Pais

C/o., TATA STEEL LIMITED, Jamshedpur – 831 001, Jharkhand, India

4. Amit Kumar Chatterjee

C/o., TATA STEEL LIMITED, Jamshedpur – 831 001, Jharkhand, India

5. Mohseen Azad Kadarbhai

C/o., TATA STEEL LIMITED, Jamshedpur – 831 001, Jharkhand, India

6. Biswajit Ghosh

C/o., TATA STEEL LIMITED, Jamshedpur – 831 001, Jharkhand, India

Specification

TECHNICAL FIELD
[001] The present disclosure relates to steel coil fabrication. More particularly,
the present disclosure relates to a system and a method for implementing a
reinforcement learning model to optimize the steel coil fabrication.
BACKGROUND OF THE INVENTION
[002] The following description includes information that may be useful in
understanding the present invention. It is not an admission that any of the
information provided herein is prior art or relevant to the presently claimed
invention, or that any publication specifically or implicitly referenced is prior art.
[003] Steel may be generally manufactured based on a composition of multiple
materials (such as, but not limited to, carbon, manganese, etc.), which may be
subjected to multiple operations (such as, but not limited to, a casting operation, a hot
rolling operation, a cold rolling operation, etc.). Based on such operations, a steel product
(such as, a steel coil) may be manufactured. In certain cases, there may be a
requirement to enhance a specific mechanical property (such as, a yield strength)
of the steel product. In such cases, it may be difficult and time consuming for an
operator to manually determine a suitable material composition and corresponding
operating conditions to produce the steel product with the improved mechanical
property.
[004] The present disclosure is directed to overcome one or more limitations
stated above or any other limitations associated with the prior art.
SUMMARY OF THE INVENTION
[005] The following presents a simplified summary to provide a basic
understanding of some aspects of optimizing the steel coil fabrication. This
summary is not an extensive overview and is intended to neither identify key or
critical elements nor delineate the scope of such elements. Its purpose is to present
some concepts of the described features in a simplified form as a prelude to the
more detailed description that is presented later.
[006] An exemplary aspect of the disclosure may provide a method to optimize
mechanical properties of a steel coil. The method may include receiving
mechanical properties to be achieved for the steel coil during a steel coil
fabrication. The method may further include receiving one or more control
parameters for each operation of a plurality of operations implemented during the
steel coil fabrication. The method further includes determining, based on the one
or more control parameters, an optimal action for each control parameter of the
one or more control parameters, such that, the optimal action indicates an action
required to achieve the mechanical properties of the steel coil. The method may
further include implementing a machine learning (ML) model to create an
environment such that a learning agent associated with the environment is
configured to learn about the optimal action for each control parameter of the one
or more control parameters and control each operation of the plurality of operations
of the steel coil fabrication, via the learning agent.
[007] Another exemplary aspect of the disclosure may provide a system to
optimize mechanical properties of a steel coil. The system may include a
processor, and the processor may be configured to receive mechanical properties to
be achieved for the steel coil during a steel coil fabrication; receive one or more
control parameters for each operation of a plurality of operations implemented
during the steel coil fabrication. The processor may be further configured to
determine, based on the one or more control parameters, an optimal action for
each control parameter of the one or more control parameters, such that, the
optimal action indicates an action required to achieve the mechanical properties of
the steel coil. The processor may be further configured to implement a machine
learning (ML) model to create an environment such that a learning agent
associated with the environment is configured to: learn about the optimal action for
each control parameter of the one or more control parameters and control each
operation of the plurality of operations of the steel coil fabrication, via the learning
agent.
[008] It is to be understood that the aspects and embodiments of the disclosure
described above may be used in any combination with each other. Several of the
aspects and embodiments may be combined to form a further embodiment of the
disclosure.
[009] The above summary is provided merely for the purpose of summarizing
some example embodiments to provide a basic understanding of some aspects of
the disclosure. Accordingly, it will be appreciated that the above-described
embodiments are merely examples and should not be construed to narrow the
scope or spirit of the disclosure in any way. It will be appreciated that the scope of
the disclosure encompasses many potential embodiments in addition to those here
summarized, some of which will be further described below.
OBJECTS OF THE INVENTION
[010] An object of the present disclosure is to develop a Reinforcement Learning
(RL) based framework to optimize the mechanical properties of steel using
reinforcement learning techniques.
[011] Another object of the present disclosure is to produce different grades of
steel, each grade having certain properties such as Ultimate Tensile Strength (UTS),
Elongation Strength (ES) and Yield Strength (YS), where the process control needed
to obtain the desired properties in a steel product is achieved using the RL model.
EFFECTS/ADVANTAGES OF THE PRESENT INVENTION
[012] The RL-based framework has several advantages over the existing
methods.
[013] A main advantage of the present disclosure is that the RL-based
framework is configured to automatically determine suitable material composition
and corresponding operation condition to produce a steel product with improved
mechanical property.
[014] Another advantage of the present disclosure is that the RL-based
framework reduces a count of trials required to achieve optimal mechanical
property, resulting in reduced wastage of raw material and energy consumption.
[015] Yet another advantage of the present disclosure is that it improves the
efficiency of the manufacturing process and reduces the overall cost of production.
[016] Yet another advantage of the present disclosure is that the RL-based
framework operates accurately in an unknown regime of operations, where data
samples associated with the steel fabrication are limited.
BRIEF DESCRIPTION OF THE DRAWINGS
[017] The accompanying drawings, which are incorporated in and constitute a
part of this disclosure, illustrate exemplary embodiments and, together with the
description, explain the disclosed principles.
[018] FIG. 1 illustrates an exemplary environment of an RL-based framework for
implementing embodiments consistent with the present disclosure.
[019] FIG. 2 illustrates a tabular diagram of a Q-Matrix in accordance with the
present disclosure.
[020] FIG. 3 illustrates a block diagram of a system to optimize mechanical
properties of a steel coil, in accordance with the present disclosure.
[021] FIG. 4A illustrates a flow diagram that illustrates a method flow to form the
RL-based framework of FIG. 1, in accordance with an embodiment of the present
disclosure.
[022] FIG. 4B illustrates an exemplary scenario that illustrates a real-time
implementation of the RL framework of FIG. 1, in accordance with an embodiment
of the present disclosure.
DETAILED DESCRIPTION
[023] Exemplary embodiments are described with reference to the
accompanying drawings. Wherever convenient, the same reference numbers are
used throughout the drawings to refer to the same or like parts. While examples
and features of disclosed principles are described herein, modifications,
adaptations, and other implementations are possible without departing from the
spirit and scope of the disclosed embodiments. It is intended that the following
detailed description be considered as exemplary only, with the true scope and spirit
being indicated by the following claims. Additional illustrative embodiments are
listed below.
[024] In an embodiment, the present disclosure is directed towards an RL-based
framework to optimize the mechanical property of coils using reinforcement
learning (RL) techniques.
[025] FIG. 1 illustrates an exemplary environment of an RL-based framework for
implementing embodiments consistent with the present disclosure. With reference
to FIG. 1, there is shown an RL-based framework (100). The RL-based framework
(100) may include a Machine Learning (ML) model (102) and a Reinforcement
Learning (RL) optimizer (104). In an embodiment, at least one of: the ML model
(102) or the RL optimizer (104), may be stored in a Q-matrix (106).
[026] The RL-based framework (100) may include a learning agent that may be
configured to operate in an environment to control a mechanical property (for
example, a yield stress, an ultimate tensile stress, or an elongation percentage) of
coils (for example, a steel coil). For example, the learning agent may be configured
to move from one operational state (for example, a casting operation) to another
operational state (for example, a hot rolling operation) to learn information
associated with such operational states and achieve a goal of attaining the
mechanical property of the steel coil.
[027] The ML model (102) may be formed based on historical data of material
parameters and process parameters to predict mechanical properties of the steel
coil, such as, but not limited to, an ultimate tensile strength (UTS), a yield strength
(YS), or an elongation (EL). In an example, the material parameters may include,
but are not limited to, a manganese percentage (Mn%), a carbon percentage (C%), a
niobium percentage (Nb%), a phosphorus percentage (P%), a silicon
percentage (Si%), a titanium percentage (Ti%), or a nitrogen percentage (N%). In
another example, the process parameters may include, but are not limited to, a
temperature control, a force control, or a speed control of the corresponding
operational state.
[028] The ML model (102) may include one of: a classifier, a regression model, or a
clustering model, which may be trained to identify a relationship between inputs,
such as features in a training dataset, and output labels, such as a distance to reach
an operational goal and a corresponding reward or penalty associated with such
distance. The ML model (102) may be defined by its hyper-parameters, for
example, number of weights, cost function, input size, number of layers, and the
like. The parameters of the ML model (102) may be tuned, and weights may be
updated to move towards a global minimum of a cost function for the ML model
(102). After several epochs of training on the feature information in the training
dataset, the ML model (102) may be trained to output a prediction/classification
result for a set of inputs. The prediction result may be indicative of a class label for
each input of the set of inputs (e.g., input features extracted from new/unseen
instances).
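As an illustration of the regression option, the following is a minimal sketch that fits a one-variable least-squares line relating a material parameter to a mechanical property. The data points, the choice of carbon percentage as the single input, and the names fit_line and predict_ys are assumptions made for illustration only; the disclosure does not specify the model form or any training data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Synthetic, illustrative history (not plant data):
# carbon percentage (C%) against yield strength in MPa.
carbon_pct = [0.05, 0.10, 0.15, 0.20]
yield_mpa = [250.0, 300.0, 350.0, 400.0]

slope, intercept = fit_line(carbon_pct, yield_mpa)


def predict_ys(c_pct):
    """Predict yield strength (MPa) for a given carbon percentage."""
    return slope * c_pct + intercept
```

A practical model would take several material and process parameters as inputs; the single-input form is used here only to keep the fitting step visible.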
[029] The ML model (102) may include electronic data, which may be
implemented as, for example, a software component of an application executable
on the RL-based framework (100). The ML model (102) may rely on libraries,
external scripts, or other instructions for execution by a processing device, such
as a processor (shown in FIG. 3). The ML model (102) may include code and
routines configured to enable a computing device, such as the processor, to
perform one or more operations, such as, to create the environment such that the
learning agent associated with the environment is configured to: learn about the
optimal action for each control parameter of one or more control parameters
associated with a plurality of operations of the steel coil fabrication, and control
each operation of the plurality of operations of the steel coil fabrication, via the learning agent.
Alternatively, the ML model (102) may be implemented using hardware including
the processor, a field-programmable gate array (FPGA), or an application-specific
integrated circuit (ASIC). Alternatively, in some embodiments, the ML model (102)
may be implemented using a combination of hardware and software. Further
details of the ML model (102) have been omitted from the disclosure for the sake
of brevity.
[030] In an embodiment, the learning agent associated with the ML model (102)
may be configured to perceive and interpret its environment, perform the set of
actions {A11, A12, ..., A14}, {A21, A22, ..., A24}, ..., {A41, A42, ..., A44} (collectively referred to as A11-
A44) and learn through trial and error in the environment of a steel coil
fabrication. The ML model (102) may be initially trained based on the set of actions
performed for each operation and their corresponding results. Once the ML model
(102) is trained, it can be implemented for optimizing the mechanical properties of
the steel coil.
[031] Based on the interpretation and learning, the learning agent may be
configured to control each operational state of the steel coil fabrication and
achieve the mechanical property of the steel coil, via the RL optimizer (104).
[032] The RL optimizer (104) may be configured to suggest a current action to
an operator to control the material parameter or the process parameter, to achieve
the optimal mechanical property of the steel coil. The learning agent associated
with the RL optimizer (104) may be configured to execute a control of the steel coil
fabrication, based on rewarding desired behaviors and/or punishing undesired
ones. In an embodiment, the learning agent is able to perceive and interpret its
environment, take actions and learn through trial and error. For example, the
learning agent may communicably interact with a plurality of operations (S1 – S4)
associated with the steel coil fabrication to learn operational states that may be
required to form the steel coil. Based on instructions from the RL optimizer (104),
the learning agent may suggest a suitable action from a set of actions (A11-A44) for
each operation of the plurality of operations (S1 – S4). In an example, executing an
action in a specific operation provides the learning agent with a reward. Positive
rewards may be assigned to desired actions to encourage the learning agent, and
negative rewards may be assigned to undesired behaviors. Based on such
rewards, the learning agent learns to seek the maximum overall long-term reward
to achieve the optimal mechanical property
for the steel coil.
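The reward and penalty scheme can be sketched as follows, assuming the reward is derived from the distance between predicted and target mechanical properties, in line with the distance-based labels mentioned for the ML model (102). The function names, the relative-gap distance, and the +1/-1 reward values are illustrative assumptions; only the 465 MPa and 350 MPa targets echo example values given later in the disclosure.

```python
def distance_to_target(predicted, target):
    """Sum of relative absolute gaps across the target properties."""
    return sum(abs(predicted[k] - target[k]) / abs(target[k]) for k in target)


def reward(prev_predicted, new_predicted, target):
    """Return +1.0 if the action moved the coil closer to the target, else -1.0."""
    before = distance_to_target(prev_predicted, target)
    after = distance_to_target(new_predicted, target)
    return 1.0 if after < before else -1.0


# Target properties in MPa (example values from the disclosure).
target = {"UTS": 465.0, "YS": 350.0}

# Hypothetical predicted properties before and after an action.
before_action = {"UTS": 440.0, "YS": 330.0}
after_action = {"UTS": 455.0, "YS": 345.0}

good = reward(before_action, after_action, target)   # action reduced the gap
bad = reward(after_action, before_action, target)    # action widened the gap
```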
[033] The Q-matrix (106) is a database, which may be configured to store
information associated with the plurality of operations (S1 – S4) and the
set of actions (A11-A44), in the RL-based framework (100). For example, the Q-matrix
(106) may store each optimal action “A12”, “A23”, “A34” and “A43” selected
for their respective operations S1-S4. The Q-matrix (106) may further store
information associated with all operations of the plurality of operations S1-S4, to
optimize the mechanical properties of the steel coil. The Q-matrix
(106) is further explained, for example, in FIG. 2.
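A minimal sketch of such a Q-matrix is shown below as a nested dictionary keyed by operation and action. The numeric Q-values are hypothetical and chosen only so that the highest-valued action per operation matches the example actions A12, A23, A34 and A43 named above; the helper best_actions is likewise an assumed name.

```python
# Hypothetical Q-values for operations S1-S4 and actions A11-A44.
q_matrix = {
    "S1": {"A11": 0.10, "A12": 0.90, "A13": 0.30, "A14": 0.20},
    "S2": {"A21": 0.20, "A22": 0.40, "A23": 0.80, "A24": 0.10},
    "S3": {"A31": 0.30, "A32": 0.10, "A33": 0.50, "A34": 0.70},
    "S4": {"A41": 0.20, "A42": 0.30, "A43": 0.95, "A44": 0.40},
}


def best_actions(q):
    """Pick, for each operation, the action with the highest Q-value."""
    return {op: max(actions, key=actions.get) for op, actions in q.items()}


optimal = best_actions(q_matrix)
```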
[034] FIG. 2 illustrates a tabular diagram of a Q-Matrix in accordance with the
present disclosure. FIG. 2 is explained in conjunction with FIG. 1. With reference
to FIG. 2, there is shown a tabular diagram (200), which depicts the Q-matrix (106).
[035] To train the RL-based framework (100), the learning agent takes an action
(a_t) from the current operation (s_t) in the environment, to move to the next
operation (s_{t+1}). The reward that results from taking the action in the environment is
denoted as r_t. Such rewards help the model to learn whether the action taken by
the agent is desired or not. These variables may be used to update the Q-value at
each operation and action using the Bellman equation shown below:
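The standard Q-learning form of the Bellman update, which is consistent with the learning rate and the discount factor described in this disclosure, can be written as follows. The symbols \alpha (learning rate) and \gamma (discount factor) are assumed notation, and this form is offered as the conventional one rather than as a reproduction of the original figure.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```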
[036] In an embodiment, a learning rate (or step
size) determines to what extent newly acquired information overrides old
information. A factor of 0 makes the agent learn nothing (exclusively exploiting
prior knowledge), while a factor of 1 makes the agent consider only the most recent
information (ignoring prior knowledge to explore possibilities). In another
embodiment, a discount factor may determine the importance of future rewards. A
factor of 0 may make the agent "myopic" (or short-sighted) by only considering
current rewards, while a factor approaching 1 may make it strive for a long-term
high reward. If the discount factor meets or exceeds 1, the action values may
diverge.
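The effect of the two factors can be checked with a small sketch of the standard tabular Q-learning update. The function name q_update and the numeric inputs are illustrative assumptions, not values from the disclosure.

```python
def q_update(q_old, reward, max_q_next, alpha, gamma):
    """Standard Q-learning update for a single (operation, action) pair.

    alpha is the learning rate and gamma is the discount factor.
    """
    return q_old + alpha * (reward + gamma * max_q_next - q_old)


# Learning rate 0: the old estimate is kept, so nothing is learned.
unchanged = q_update(q_old=2.0, reward=1.0, max_q_next=4.0, alpha=0.0, gamma=0.9)

# Learning rate 1: the old estimate is fully replaced by the new target.
replaced = q_update(q_old=2.0, reward=1.0, max_q_next=4.0, alpha=1.0, gamma=0.9)

# Discount factor 0: only the immediate reward counts ("myopic" agent).
myopic = q_update(q_old=2.0, reward=1.0, max_q_next=4.0, alpha=1.0, gamma=0.0)
```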
[037] During the training of the RL-based framework (100), the learning agent
takes an action (a_t) from the current operation (s_t) in the environment, to move to
the next operation (s_{t+1}). The reward that results from taking the action in the environment
is denoted as r_t. These variables are stored in a replay buffer or memory. The
learning agent continues to take actions until it reaches refined boundaries or an optimal
range. At every 100th step, a small sample from the replay buffer is taken to update
the Q-values using the Bellman equation and train the main neural network with the
updated Q-values. The process is continued for a few episodes to let the RL-based
framework (100) explore different possible operations and learn their Q-values.
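The buffering and periodic sampling described above can be sketched as follows. The class name ReplayBuffer, the buffer capacity, the batch size of 32 and the placeholder transitions are assumptions for illustration; only the 100-step update interval comes from the description.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # Oldest transitions are discarded automatically once capacity is reached.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Draw a random mini-batch without replacement.
        k = min(batch_size, len(self.memory))
        return random.sample(list(self.memory), k)

    def __len__(self):
        return len(self.memory)


UPDATE_EVERY = 100  # sampling interval stated in the description
buffer = ReplayBuffer(capacity=1000)
updates = 0

for step in range(1, 351):
    # Placeholder transition; a real agent would record plant or model feedback.
    buffer.push(state=step, action="A11", reward=0.0, next_state=step + 1)
    if step % UPDATE_EVERY == 0:
        batch = buffer.sample(32)  # this batch would feed the Q-value update
        updates += 1
```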
[038] The learning agent may use a reinforcement learning technique to
optimize the target variables, via a Deep Q-Network (DQN) algorithm. The DQN
algorithm uses Q-learning to learn the best action to take in a given operation and
a convolutional neural network to estimate a Q-value function. The input for the
neural network is the operation in the environment and its output is the Q-values
for all actions in the set of actions. The action with the highest Q-value is selected as the
recommended action for that operation.
[039] In the DQN algorithm, two neural networks may be used to ensure a stable
learning process. The first one is called the main neural network. The second one
is the target neural network, and it may have the exact same architecture as the
main network. All the learning takes place in the main network. The target network
is frozen (its parameters are left unchanged) and then the weights of the main
network are copied into the target network at the end of each episode, thus
transferring the learned knowledge from one to the other.
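The main and target network arrangement can be sketched with plain dictionaries standing in for network weights. The weight layout, the train_step stand-in and the episode and step counts are illustrative assumptions; a real implementation would apply gradient updates to a neural network.

```python
import copy

# Plain dictionaries stand in for neural network weights.
main_weights = {"w": [0.0, 0.0], "b": 0.0}
target_weights = copy.deepcopy(main_weights)  # same architecture, separate copy


def train_step(weights, lr=0.1):
    """Stand-in for a gradient step; only the main network is ever trained."""
    weights["w"] = [w + lr for w in weights["w"]]
    weights["b"] += lr


def sync_target(main, target):
    """Copy the main network's weights into the target network."""
    target.clear()
    target.update(copy.deepcopy(main))


for episode in range(3):
    for _ in range(5):
        train_step(main_weights)  # target_weights stays frozen during the episode
    frozen_snapshot = copy.deepcopy(target_weights)  # target state before syncing
    sync_target(main_weights, target_weights)        # end-of-episode copy
```

Using a deep copy keeps the two sets of weights independent, so later training of the main network does not silently change the target.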
[040] In an alternative embodiment, the RL-based framework (100) may be
configured to optimize the steel coil fabrication for different grades. Each grade
possesses certain properties, such as Ultimate Tensile Strength (UTS), Elongation
Strength (ES) and Yield Strength (YS). In an example, the RL-based framework
(100) is configured to control certain material parameters and/or process
parameters to achieve the desired mechanical properties of the steel coil.
[041] In operation, the RL-based framework (100) learns the current action by
the process of reward and penalty. The reward is assigned when the suggested
action is correct, and the penalty is assigned when the action is incorrect. The
RL-based framework (100) continuously learns from the actions taken by the operator
and updates its model to suggest better actions for achieving optimal strength. In
the training process, the RL-based framework (100) identifies the best action for a
particular operation for which the reward may be maximum (using the Bellman
Equation) and stores it against that operation in the Q-matrix (106). After the
training process, the Q-matrix (106) is frozen.
[042] Once the training is complete and the Q-matrix (106) is frozen,
the information in the Q-matrix (106) may be consulted to generate suggestions
for the optimal actions to optimize the steel coil fabrication, via the RL
optimizer (104). During implementation of the trained Q-matrix (106), the operator
may be required to input certain parameters, such as a desired mechanical
property for the steel coil, and/or a plurality of operations required to be performed
in the steel coil fabrication. In an embodiment, the mechanical properties are
selected from one of: an Ultimate Tensile Strength (UTS), a Yield Strength (YS),
or an Elongation (EL), and the plurality of operations is selected from one of: a
casting operation, a hot rolling operation, a cold rolling operation, an annealing
operation, or a galvanizing operation. Further details on such operations are
omitted from the disclosure for the sake of brevity. Based on the input data, the
RL-based framework (100) may be configured to retrieve relevant information from
the frozen Q-matrix (106), and output optimal actions (for example, a material
control, or a process control) required to achieve the desired mechanical property
of the steel coil.
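The retrieval step can be sketched as a read-only lookup, assuming the frozen Q-matrix is keyed by the desired property and the operation. The key layout, the action labels, the Q-values and the function name suggest are all hypothetical; the disclosure does not specify how the Q-matrix is indexed.

```python
from types import MappingProxyType

# Hypothetical trained Q-values, keyed by (property, operation).
_trained = {
    ("YS", "casting"): {"increase Mn%": 0.8, "decrease Mn%": 0.1},
    ("YS", "hot rolling"): {"raise temperature": 0.3, "lower speed": 0.6},
    ("UTS", "casting"): {"increase C%": 0.7, "decrease C%": 0.2},
}

# Read-only view: top-level entries cannot be reassigned after training.
# (The inner dictionaries remain ordinary dicts in this sketch.)
FROZEN_Q = MappingProxyType(_trained)


def suggest(desired_property, operations):
    """Return the highest-valued action for each requested operation."""
    suggestions = {}
    for op in operations:
        actions = FROZEN_Q.get((desired_property, op))
        if actions:  # skip operations with no learned entry
            suggestions[op] = max(actions, key=actions.get)
    return suggestions


plan = suggest("YS", ["casting", "hot rolling", "annealing"])
```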
[043] FIG. 3 illustrates a block diagram of a system to optimize mechanical
properties of a steel coil, in accordance with the present disclosure. FIG. 3 is
explained in conjunction with FIG. 1 and FIG. 2. With reference to FIG. 3, there is
shown a block diagram (300), depicting a system (302). The system (302) may
include a processor (304), a memory (306), an Input/Output (I/O) Interface (308),
and a network interface (310). All components of the system (302) may be
communicably coupled with each other, via a communication network (312).
[044] The system (302) may include suitable logic, circuitry, and interfaces that
may be configured to optimize mechanical properties of the steel coil. Examples of
the system (302) may include, but are not limited to, a computing device, a
smartphone, a cellular phone, a mobile phone, a gaming device, a mainframe
machine, a computer workstation, and/or a consumer electronic (CE) device.
[045] In an alternate embodiment, the system (302) may be a server hosted in a
cloud environment. The server may include suitable logic, circuitry, and interfaces,
and/or code that may be configured to optimize mechanical properties of the steel
coil. The server may be implemented as a cloud server and may execute
operations through web applications, cloud applications, HTTP requests,
repository operations, file transfer, and the like. Other example implementations of
the server may include, but are not limited to, a database server, a file server, a
web server, a media server, an application server, a mainframe server, or a cloud
computing server.
In at least one embodiment, the server may be implemented as a plurality of
distributed cloud-based resources by use of several technologies that are well
known to those ordinarily skilled in the art. A person with ordinary skill in the art will
understand that the scope of the disclosure may not be limited to the
implementation of the server and the system (302) as two separate entities. In
certain embodiments, the functionalities of the server can be incorporated in its
entirety or at least partially in the system (302), without a departure from the scope
of the disclosure.
[046] The processor (304) may include suitable logic, circuitry, and interfaces
that may be configured to execute program instructions associated with different
operations to be executed by the system (302). The processor (304) may include
one or more specialized processing units. In an embodiment, the one or more
specialized processing units may be implemented as an integrated processor or a
cluster of processors that perform the functions of the one or more specialized
processing units, collectively. The processor (304) may be implemented based on
a number of processor technologies known in the art. Examples of
implementations of the processor (304) may be an X86-based processor, a
Graphics Processing Unit (GPU), a Reduced Instruction Set Computing (RISC)
processor, an Application-Specific Integrated Circuit (ASIC) processor, a Complex
Instruction Set Computing (CISC) processor, a microcontroller, a central
processing unit (CPU), and/or other processors.
[047] The processor (304) is configured to receive mechanical properties to be
achieved for the steel coil during a steel coil fabrication. The mechanical properties
to be achieved may comprise an Ultimate Tensile Strength (UTS), a Yield
Strength (YS), or an Elongation (EL) and their corresponding values. For example,
the value corresponding to the UTS may be 465 megapascals (MPa), the value
corresponding to the YS may be 350 MPa, and the value corresponding to the EL may
be 32%. Those of ordinary skill in the art will appreciate that the aforesaid
mechanical properties and their values are merely an example, and therefore, the
present disclosure is not limited to the said values.
[048] The processor (304) is further configured to receive one or more control
parameters for each operation of a plurality of operations implemented during the
steel coil fabrication. Based on the one or more control parameters, the processor
(304) determines an optimal action for each control parameter of the one or more
control parameters, such that, the optimal action indicates an action required to
achieve the mechanical properties of the steel coil. The processor (304) is further
configured to implement a machine learning (ML) model to create an environment
such that a learning agent associated with the environment is configured to: learn
about the optimal action for each control parameter of the one or more control
parameters and control each operation of the plurality of operations of the steel
coil fabrication, via the learning agent.
[049] The memory (306) may include suitable logic, circuitry, and interfaces that
may be configured to store the one or more instructions to be executed by the
processor (304). In an embodiment, the memory (306) may be configured to store
information associated with the controllable parameters, such as a material control
parameter or a process control parameter. According to an embodiment, the
control parameters may be prestored in the memory (306) or may be received from
an external source or a device. In an embodiment, the memory (306) may store
the ML model (102) and/or the RL optimizer (104) to learn and implement the
optimization of the mechanical properties of the steel coil. The memory (306) may also
store the Q-matrix (106) along with the optimal actions for each of the control
parameters. In some examples, the memory (306) may represent any type of
non-transitory computer readable medium such as random-access memory (RAM),
read only memory (ROM), magnetic disk or tape, optical disk, flash memory, or
holographic memory. In an embodiment, the memory (306) may include a
combination of random-access memory and read only memory and may include
data/instructions related to processing of one or more components of the system
(302).
[050] In an embodiment, the memory (306) may include data and one or more
modules (not shown in the figures). The one or more modules may be configured to
perform the steps of the present disclosure using the data, to optimize mechanical
properties of a steel coil. In an embodiment, each of the one or more modules may
be a hardware unit which may be outside the memory (306) and coupled with the
processor (304). As used herein, the term “modules” refers to an Application
Specific Integrated Circuit (ASIC), an electronic circuit, a Field-Programmable
Gate Array (FPGA), a Programmable System-on-Chip (PSoC), a combinational
logic circuit, and/or other suitable components that provide the described functionality.
The one or more modules, when configured with the described functionality defined
in the present disclosure, will result in novel hardware. In one implementation, the
modules may include, for example, a communication module, a modification
module, and other modules. It will be appreciated that such modules may be
represented as a single module or a combination of different modules. In one
implementation, the data may include, for example, communication data,
modification data, and other data.
[051] The I/O Interface (308) may include suitable logic, circuitry, and interfaces
that may be configured to receive an input (such as, desired mechanical
properties of the steel coil, or a control parameter such as a material composition,
a process speed, a process temperature, etc. that may be required to perform the
steel coil fabrication) from a user and provide an output (such as, the optimal action
required for the steel coil fabrication to achieve the desired mechanical property)
based on the received input. The I/O interface (308) may include various input and
output devices, which may be configured to communicate with the processor (304). For
example, the system (302) may receive a user input via the I/O interface (308) and
compare the received user input against the frozen Q-matrix (106) to determine
the optimal action required for the steel coil fabrication to achieve the desired
mechanical property. The I/O interface (308), such as a display, may render inputs
and/or outputs of the ML model (102) and/or the RL optimizer (104) before or after
the Q-matrix (106) of the RL-based framework (100) is trained. Examples of the I/O
interface (308) may include, but are not limited to, a touch screen, a display device,
a keyboard, a mouse, a joystick, a microphone, or a speaker.
[052] The network interface (310) may include suitable logic, circuitry, and
interfaces that may be configured to facilitate communication between the system
(302) and a steel coil fabrication plant (314), via the communication network (312).
For example, the system (302) may receive the input (such as, desired
mechanical properties of the steel coil, or a control parameter such as a material
composition, a process speed, a process temperature, etc. that may be required
to perform the steel coil fabrication) from the user and output the optimal actions
(such as, actions associated with material control parameters (314A) of the steel
coil fabrication plant (314), or actions associated with process control parameters
(314B) of the steel coil fabrication plant (314)), via the network interface (310).
[053] In an embodiment, the material control parameters (314A) may
correspond to a control percentage of materials for each operation of the plurality
of operations (such as, the casting operation, the hot rolling operation, the cold
rolling operation, the annealing operation, or the galvanizing operation)
implemented during the steel coil fabrication. The control percentage of materials
may be selected from one of: a manganese percentage (Mn%), a carbon
percentage (C%), a niobium percentage (Nb%), or a phosphorus percentage (P%).
[054] In another embodiment, the process control parameters (314B) may
correspond to a control of various processes associated with each operation of the
plurality of operations (such as, the casting operation, the hot rolling operation, the
cold rolling operation, the annealing operation, or the galvanizing operation)
implemented during the steel coil fabrication. In an embodiment, the process
control parameters (314B) may be selected from one of: a temperature control, a
force control, or a speed control, to achieve the desired mechanical property in the
steel coil fabrication.
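As an illustrative sketch only, the material control parameters (314A) and the process control parameters (314B) described above could be grouped per operation in a simple data structure. The class names, field names, and units below are assumptions made for illustration and do not appear in the disclosure.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Operation(Enum):
    # The five operations named in the description.
    CASTING = "casting"
    HOT_ROLLING = "hot rolling"
    COLD_ROLLING = "cold rolling"
    ANNEALING = "annealing"
    GALVANIZING = "galvanizing"


@dataclass
class MaterialControl:
    # Composition percentages; field names are hypothetical.
    mn_pct: float
    c_pct: float
    nb_pct: float
    p_pct: float


@dataclass
class ProcessControl:
    # Temperature, force, and speed controls; units are assumed, not from the source.
    temperature_c: float
    force_kn: float
    speed_m_per_min: float


@dataclass
class OperationSetpoint:
    operation: Operation
    material: MaterialControl
    process: ProcessControl


def as_state_vector(sp: OperationSetpoint) -> List[float]:
    """Flatten one operation's control parameters into a numeric vector."""
    m, p = sp.material, sp.process
    return [m.mn_pct, m.c_pct, m.nb_pct, m.p_pct,
            p.temperature_c, p.force_kn, p.speed_m_per_min]
```

A flat vector of this kind is one plausible form of the input data array that a learning agent could consume; the numeric values supplied to it would come from the plant, not from this sketch.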
[055] The network interface (310) may be implemented by use of various known
technologies to support wired or wireless communication of the system (302) with
the steel coil fabrication plant (314), via the communication network (312). The
network interface (310) may include, but is not limited to, an antenna, a radio
frequency (RF) transceiver, one or more amplifiers, a tuner, one or more
oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, a
subscriber identity module (SIM) card, or a local buffer circuitry. The network
interface (310) may be configured to communicate via wireless communication
with networks, such as the Internet, an Intranet, or a wireless network, such as a
cellular telephone network, a wireless local area network (LAN), and a metropolitan
area network (MAN). The wireless communication may be configured to use one
or more of a plurality of communication standards, protocols and technologies,
such as Global System for Mobile Communications (GSM), Enhanced Data GSM
Environment (EDGE), wideband code division multiple access (W-CDMA), Long
Term Evolution (LTE), 5G NR, code division multiple access (CDMA), time division
multiple access (TDMA), Bluetooth, Wireless Fidelity (Wi-Fi) (such as IEEE
802.11a, IEEE 802.11b, IEEE 802.11g or IEEE 802.11n), voice over Internet
Protocol (VoIP), light fidelity (Li-Fi), Worldwide Interoperability for Microwave
Access (Wi-MAX), a protocol for email, instant messaging, and a Short Message
Service (SMS).
[056] The communication network (312) may include a communication medium
through which the system (302) and the steel coil fabrication plant (314) may
communicate with each other. The communication network (312) may be one of a
wired connection or a wireless connection. Examples of the communication
network (312) may include, but are not limited to, the Internet, a cloud network,
Cellular or Wireless Mobile Network (such as Long-Term Evolution and 5G New
Radio), a Wireless Fidelity (Wi-Fi) network, a Personal Area Network (PAN), a
Local Area Network (LAN), or a Metropolitan Area Network (MAN). Various
devices in the RL-based framework (100) and/or the system (302) may be
configured to connect to the communication network (312) in accordance with
various wired and wireless communication protocols. Examples of such wired and
wireless communication protocols may include, but are not limited to, at least one
of a Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram
Protocol (UDP), Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP),
ZigBee, EDGE, IEEE 802.11, light fidelity (Li-Fi), IEEE 802.16, IEEE 802.11s, IEEE
802.11g, multi-hop communication, wireless access point (AP), device to device
communication, cellular communication protocols, and Bluetooth (BT)
communication protocols.
[057] In operation, the processor (304) may be configured to receive the
mechanical properties to be achieved for the steel coil during a steel coil
fabrication. The processor (304) may further receive the one or more control
parameters (such as the material control parameter (314A) or the process control
parameter (314B)) for each operation of a plurality of operations implemented
during the steel coil fabrication. The processor (304) may determine, based on the
one or more control parameters, the optimal action for each control parameter of
the one or more control parameters, such that, the optimal action indicates the
action required to achieve the mechanical properties of the steel coil. Based on the
determination, the processor (304) may be configured to implement a machine
learning (ML) model to create an environment such that the learning agent
associated with the environment may learn about the optimal action for each
control parameter of the one or more control parameters. Based on the learning,
the learning agent may control each operation of the plurality of operations of
the steel coil fabrication.
[058] In an embodiment, the processor (304) stores the optimal action in the
Q-matrix (106) of the memory (306). In an embodiment, the optimal action may be
stored for each control parameter of the one or more control parameters, which
corresponds to each operation of the plurality of operations. Based on an addition
of training data, the processor (304) updates the stored Q-matrix (106) in an
iterative manner based on the optimal action determined for the corresponding
operation of the plurality of operations.
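One way to picture the storing and iterative updating of the Q-matrix (106) is a tabular Q-learning update. The sketch below is a minimal example under stated assumptions: the learning rate, the discount factor, the state labels, and the action names are hypothetical, and the disclosure does not specify this exact update rule.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def make_q_matrix():
    """Q[state][action] -> estimated value; unseen entries default to 0.0."""
    return defaultdict(lambda: defaultdict(float))


def update_q(q, state, action, reward, next_state, actions):
    """Apply one tabular Q-learning update and return the new entry."""
    best_next = max(q[next_state][a] for a in actions)
    td_target = reward + GAMMA * best_next
    q[state][action] += ALPHA * (td_target - q[state][action])
    return q[state][action]


def optimal_action(q, state, actions):
    """Return the stored action with the highest value for a state."""
    return max(actions, key=lambda a: q[state][a])


# Hypothetical usage: states are (operation, condition) pairs.
q = make_q_matrix()
actions = ["decrease", "hold", "increase"]
update_q(q, ("annealing", "low_uts"), "increase", 1.0,
         ("annealing", "on_target"), actions)
```

Each additional training sample calls `update_q` once, which is one possible reading of updating the matrix "in an iterative manner" as new training data is added.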
[059] In another embodiment, the processor (304) controls, based on the optimal
action, the learning agent to independently control each control parameter of the
one or more control parameters, for each operation of the plurality of operations to
achieve the mechanical properties of the steel coil during the steel coil fabrication.
Details of such control are described further, for example, in FIG. 4B.
[060] In yet another embodiment, the processor (304) implements an interpreter
element in the environment of the ML model (102), wherein the interpreter element
is configured to train the learning agent, based on a transfer of at least one of: a
reward value or a penalty value, from the interpreter element to the learning agent;
and control, based on one of: the optimal action, the reward value, or the penalty
value, the learning agent to independently control each control parameter of the
one or more control parameters, for each operation of the plurality of operations to
achieve the mechanical properties of the steel coil during the steel coil fabrication.
Details of the implementation of the interpreter element are described further, for
example, in FIG. 4B.
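The reward and penalty transfer performed by the interpreter element can be sketched as a small scoring function. This is only an assumed formulation: the relative-error measure, the tolerance, the signal values of +1.0 and -1.0, and the target numbers in the usage lines are placeholders that the disclosure does not specify.

```python
def interpret(predicted, target, previous_error=None, tolerance=0.05):
    """Return (signal, error) for one step.

    signal is +1.0 (reward) when the mean relative error is within the
    tolerance or has decreased since the previous step, else -1.0 (penalty).
    """
    error = sum(abs(predicted[k] - target[k]) / abs(target[k])
                for k in target) / len(target)
    if error <= tolerance:
        return 1.0, error
    if previous_error is not None and error < previous_error:
        return 1.0, error
    return -1.0, error


# Hypothetical target properties (placeholder numbers).
target = {"UTS": 600.0, "YS": 400.0, "EL": 20.0}
signal, err = interpret({"UTS": 540.0, "YS": 360.0, "EL": 18.0}, target)
```

The learning agent would then use the returned signal as the reward term when updating its stored action values.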
[061] FIG. 4A is a flow diagram that illustrates a method flow of the RL-based
framework (100) of FIG. 1, in accordance with an embodiment of the present disclosure.
FIG. 4A is described in conjunction with elements from FIG. 1, FIG. 2, and FIG. 3.
With reference to FIG. 4A, there is shown a flowchart (400A) that illustrates a
construction of the Q-matrix (106) of the RL-based framework (100). The flowchart
(400A) starts at step (402).
[062] At step 402, the method includes receiving mechanical properties to be
achieved for the steel coil during the steel coil fabrication. In an embodiment, the
mechanical properties may include, but are not limited to, the Ultimate Tensile Strength
(UTS), the Yield Strength (YS) and the Elongation Percentage (%EL), for the steel
coil.
[063] At step 404, the method includes receiving one or more control
parameters for each operation of a plurality of operations implemented during the
steel coil fabrication. The plurality of operations may include, but are not limited to, the
casting operation, the hot rolling operation, the cold rolling operation, the annealing
operation, or the galvanizing operation. In an embodiment, the one or more control
parameters may include the material control parameters (314A) and/or the process
control parameters (314B).
[064] In an embodiment, the material control parameters (314A) may
correspond to a control percentage of materials for each operation of the plurality
of operations (such as, the casting operation, the hot rolling operation, the cold
rolling operation, the annealing operation, or the galvanizing operation)
implemented during the steel coil fabrication. The control percentage of materials
may be selected from one of: a manganese percentage (Mn%), a carbon
percentage (C%), a niobium percentage (Nb%), or a phosphorus percentage (P%).
[065] In another embodiment, the process control parameters (314B) may
correspond to a control of various processes associated with each operation of the
plurality of operations (such as, the casting operation, the hot rolling operation, the
cold rolling operation, the annealing operation, or the galvanizing operation)
implemented during the steel coil fabrication. In an embodiment, the process
control parameters (314B) may be selected from one of: a temperature control, a
force control, or a speed control, to achieve the desired mechanical property in the
steel coil fabrication.
[066] At step 406, the method includes determining, based on the one or more
control parameters, the optimal action for each control parameter of the one or
more control parameters, such that, the optimal action indicates an action required
to achieve the mechanical properties of the steel coil. In an embodiment, the
optimal action for each control parameter of the one or more control parameters is
determined based on a Bellman Equation. In an example, the optimal action for
the material control parameters (314A) may include a control percentage of
materials for each operation of the plurality of operations implemented during the
steel coil fabrication. The control percentage of materials is selected from
one of: a manganese percentage (Mn%), a carbon percentage (C%), a niobium
percentage (Nb%), or a phosphorus percentage (P%). In another example, the
optimal action for the process control parameters (314B) may correspond to a
control of various processes associated with each operation of the plurality of
operations implemented during the steel coil fabrication. The process control
parameters (314B) may include, but are not limited to, a temperature control, a force
control, or a speed control.
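The disclosure does not state which form of the Bellman Equation is used. For reference, the standard Bellman optimality equation for an action-value function, and the tabular Q-learning update commonly derived from it, are given below. The symbols are assumptions introduced here: s is the current operation state, a is an action on a control parameter, r is the reward, s' is the next state, \gamma is a discount factor between 0 and 1, and \alpha is a learning rate.

```latex
Q^{*}(s, a) = \mathbb{E}\left[\, r + \gamma \max_{a'} Q^{*}(s', a') \;\middle|\; s, a \,\right]

Q(s, a) \leftarrow Q(s, a) + \alpha \left[\, r + \gamma \max_{a'} Q(s', a') - Q(s, a) \,\right]
```

Under this reading, the optimal action for a state s is the action a that maximizes Q(s, a).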
[067] At step 408, the method includes implementing the ML model (102) to
create an environment such that a learning agent associated with the environment
is configured to: learn about the optimal action for each control parameter of the
one or more control parameters and control each operation of the plurality of
operations of the steel coil fabrication, via the learning agent.
[068] The RL-based framework (100) associated with the ML model (102) and
the RL optimizer (104), may be configured to simulate the environment with the
learning agent of the RL-based framework (100), using input values of at least one
of: the material control parameters (314A), the process control parameters (314B),
or environmental parameters (for example, a percentage of silicon, or a percentage
of titanium, in the steel coil fabrication). The material control parameters (314A),
the process control parameters (314B), or the environmental parameters may be
provided based on the plurality of operations, such as one or more of: steel
making, hot rolling, cold rolling, annealing, or galvanizing operations, which may
be required for the steel coil fabrication.
[069] During simulation, the ML model (102) may predict mechanical properties
of the steel coil i.e., UTS, YS and %EL post simulation, using the Q-matrix (106)
associated with the ML model (102) and the RL optimizer (104). The Q-matrix (106)
is trained on historical trial data to determine suitable actions in terms of
adjusting the process parameters to progress from one process operation to
another process operation, which may result in the desired mechanical properties
for the steel coil. Based on the prediction of the ML model (102) of the RL-based
framework (100), the RL optimizer (104) may determine actions to be taken based
on the predicted mechanical properties and the Q-matrix (106) for each process
operation of controllable parameters and may proceed for a subsequent simulation
iteration to progress towards the desired mechanical properties. Therefore, such
iterative simulation process shall be continued, based on the recommended
actions for the at least one of: the material control parameters (314A), the process
control parameters (314B) or the environmental parameters, until the predicted
mechanical properties align with the desired mechanical properties for optimizing
mechanical properties of a steel coil.
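The iterative simulation described above (predict, compare with the desired properties, act, repeat) can be sketched as a short loop. Everything concrete here is an assumption for illustration: `predict_uts` is a made-up linear stand-in for the ML model (102), the action set and step sizes are hypothetical, and a greedy one-step search is used in place of the RL optimizer (104) and the Q-matrix (106).

```python
def predict_uts(params):
    # Hypothetical stand-in for the ML model (102); coefficients are invented.
    return 300.0 + 100.0 * params["mn_pct"] + 0.2 * params["anneal_temp_c"]


# Hypothetical actions: (parameter name, increment).
ACTIONS = [
    ("mn_pct", 0.05), ("mn_pct", -0.05),
    ("anneal_temp_c", 10.0), ("anneal_temp_c", -10.0),
]


def apply_action(params, action):
    """Return a copy of params with one control parameter adjusted."""
    name, delta = action
    updated = dict(params)
    updated[name] += delta
    return updated


def simulate(params, target_uts, tolerance=2.0, max_iters=100):
    """Iterate until the predicted UTS is within tolerance of the target."""
    history = []
    for _ in range(max_iters):
        if abs(predict_uts(params) - target_uts) <= tolerance:
            break
        # Greedy stand-in for the RL optimizer: pick the action whose
        # predicted outcome lies closest to the target.
        best = min(ACTIONS, key=lambda a: abs(
            predict_uts(apply_action(params, a)) - target_uts))
        params = apply_action(params, best)
        history.append(best)
    return params, history


final_params, steps = simulate({"mn_pct": 1.0, "anneal_temp_c": 700.0}, 560.0)
```

The `max_iters` bound is a design choice of the sketch, so that the loop terminates even when the target cannot be reached with the available actions.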
[070] In an embodiment, the ML model (102) and the RL optimizer (104) may
recommend subsequent actions corresponding to each of the plurality of
operations of a subsequent process, and the recommended actions may be applied
to the subsequent process implemented for optimizing mechanical properties of a
subsequent steel coil using the Q-matrix (106).
[071] Based on the training and construction of the Q-matrix (106), the Q-matrix
(106) may be configured to be stored in the memory (306). In an example, the
optimal action may be stored in the Q-matrix (106) of the memory (306). The
optimal action may be stored for each control parameter of the one or more control
parameters, which may correspond to each operation of the plurality of operations.
Based on addition of training data, the Q-matrix (106) may be updated in an
iterative manner based on the optimal action determined for the corresponding
operation of the plurality of operations.
[072] The order in which the flowchart (400A) is described is not intended to be
construed as a limitation, and any number of the described method blocks can be
combined in any order to implement the flowchart (400A) or alternate methods.
Additionally, individual blocks may be deleted from the flowchart (400A) without
departing from the spirit and scope of the subject matter described herein.
Furthermore, the method can be implemented in any suitable hardware, software,
firmware, or combination thereof.
[073] FIG. 4B illustrates an exemplary scenario that illustrates a real-time
implementation of the RL framework of FIG. 1, in accordance with an embodiment
of the present disclosure. FIG. 4B is described in conjunction with elements from
FIG. 1, FIG. 2, FIG. 3, and FIG. 4A. With reference to FIG. 4B, there is shown an
exemplary scenario that illustrates a flowchart (400B) depicting the real-time
implementation of the RL-based framework (100).
[074] Based on the user inputs on the desired mechanical properties, the
material control parameters (314A), and the process control parameters (314B),
the RL-based framework (100) may control the RL optimizer (104) to suggest a
suitable optimal action. Based on the suggested optimal action, the learning agent
may be controlled to independently control each control parameter of the one or
more control parameters, for each operation of the plurality of operations to achieve
the mechanical properties of the steel coil during the steel coil fabrication. In an
embodiment, the processor (304) may control the learning agent to achieve the
mechanical properties of the steel coil.
[075] In an alternate embodiment, the interpreter element may be implemented
in the environment of the ML model (102). The interpreter element may train the
learning agent, based on a transfer of at least one of: a reward value or a penalty
value, from the interpreter element to the learning agent. Based on the transfer of
at least one of: the optimal action, the reward value, or the penalty value, the
learning agent may be controlled to independently control each control parameter
of the one or more control parameters, for each operation of the plurality of
operations to achieve the mechanical properties of the steel coil during the steel
coil fabrication. In an embodiment, the processor (304) may control the interpreter
element and the learning agent to achieve the mechanical properties of the steel
coil.
[076] In operation, the optimal action for the material control parameters (314A)
may include a notification to the operator. In an example, the notification may be
a message prompt on a display (not shown) of the I/O interface (308) associated
with the system (302). The message prompt may include, “Reduce Manganese
percentage by 10%”. In another embodiment, the optimal action for the process
control parameters (314B) may include a notification to the operator. In an
example, the notification may be a message prompt on the display of the I/O
interface (308) associated with the system (302). The message prompt may
include, “Increase casting temperature by 200 degrees Celsius”. The message
prompt of the notification is merely an example. Other examples of the notification
may include, but are not limited to, an audible notification (such as, an alert via the
speakers), a visual notification (such as, color-coded indicators on the display), an
audio-visual notification (such as, a combinatory alert from the display and
speakers), or a tactile notification (such as, an alert via a vibration motor of the I/O
interface (308) associated with the system (302)).
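As a small illustration of how a recommended action might be turned into the message prompts quoted above, the hypothetical helper below formats a parameter name, a signed change, and a unit into operator-facing text. The function name and its signature are assumptions, not part of the disclosure.

```python
def format_prompt(parameter: str, delta: float, unit: str) -> str:
    """Build an operator prompt from a signed change to a control parameter."""
    if delta == 0:
        return f"Hold {parameter}"
    direction = "Increase" if delta > 0 else "Reduce"
    return f"{direction} {parameter} by {abs(delta):g}{unit}"


# Reproduces the two example prompts given in the description.
print(format_prompt("Manganese percentage", -10, "%"))
print(format_prompt("casting temperature", 200, " degrees Celsius"))
```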
[077] In an embodiment, when a steel of a different grade possessing different
properties needs to be developed, an alteration in the process control is performed.
For instance, when an action is recommended by the RL model (412) for the steel
in a present suboptimal process state (410), the current process moves closer to
an optimal state (414) and may reach the optimal state (416). In certain cases,
the optimal state (416) may not be reached. In such cases, if the present state is
not optimal (418), the reached state may be re-routed to the present suboptimal
process state (410).
[078] During the training process, the RL-based framework (100) identifies a
best action for a particular operation for which the reward may be maximum using
the Bellman Equation and stores it against that operation in the Q-matrix (106). That
is, in real-time, when an input data array comes based on the zone of operation
state, the action may be decided by referring to the frozen Q-matrix (106). By taking
that action, the agent may reach another operation that may be closer to the
optimal target mechanical properties. Similar actions may be taken until the optimal
target is reached.
[079] After the training process, the Q-matrix (106) is frozen and in real-time,
when the input data is received, the RL-based framework (100) uses the frozen
Q-matrix (106) to suggest an action that may move the system (302) closer to the
optimal target mechanical properties. For instance, in an exemplary embodiment,
when an input data array comes based on the zone of operation state, the action
may be decided by referring to the frozen Q-matrix (106). By taking that action, the
learning agent may reach another operation that may be closer to the optimal
target mechanical properties. Similar actions may be taken until the optimal target
is reached. Thus, the process is repeated until the system reaches the target.
Hence, the RL-based framework (100) operates accurately in an unknown regime
of operations where data samples are limited, making it a valuable tool for
optimizing the steel coil fabrication environment.
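The real-time use of the frozen Q-matrix (106) amounts to a repeated greedy lookup. The following sketch assumes a tiny read-only table with invented state names, action names, values, and transitions; none of these come from the disclosure, and a real deployment would observe the next state from the plant rather than from a lookup table.

```python
from types import MappingProxyType

# Hypothetical frozen Q-matrix: state -> {action: value}. Placeholder values.
FROZEN_Q = MappingProxyType({
    "far_below_target": {"increase": 0.9, "hold": 0.1, "decrease": -0.5},
    "near_below_target": {"increase": 0.6, "hold": 0.2, "decrease": -0.4},
    "on_target": {"increase": -0.2, "hold": 1.0, "decrease": -0.2},
})

# Hypothetical transitions standing in for the plant response.
TRANSITIONS = {
    ("far_below_target", "increase"): "near_below_target",
    ("near_below_target", "increase"): "on_target",
}


def recommend(state):
    """Look up the highest-valued action for a state in the frozen table."""
    actions = FROZEN_Q[state]
    return max(actions, key=actions.get)


def run_until_target(state, target="on_target", max_steps=10):
    """Repeat the lookup-and-act cycle until the target state is reached."""
    path = []
    for _ in range(max_steps):
        if state == target:
            break
        action = recommend(state)
        path.append((state, action))
        state = TRANSITIONS.get((state, action), state)
    return state, path


final_state, path = run_until_target("far_below_target")
```

Wrapping the table in `MappingProxyType` is one simple way to express that the matrix is not modified after training.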
[080] The RL-based framework (100) has several advantages over the existing
methods. It can operate accurately in an unknown regime of operations where data
samples are limited. It reduces the number of trials required to achieve optimal
strength, resulting in reduced wastage of raw material and energy consumption. It
improves the efficiency of the manufacturing process and reduces the overall cost
of production.
[081] In an embodiment, Table 1 provides the models built for the three target
variables, with all the other parameters as the inputs (X’s). The Root Mean Square
Error (RMSE) on the validation data is the metric used, and it is required to be
within 15 to ensure that the models can predict for different ranges of values.
Neural network models performed better than the linear regression models and
provided a better RMSE.
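The validation check described above can be written out directly. The sketch below computes the RMSE and compares it with the limit of 15 stated in the text; the function names and the sample values in the usage lines are hypothetical and are not results from the disclosure.

```python
import math

RMSE_LIMIT = 15.0  # acceptance limit stated in the description


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def passes_validation(actual, predicted, limit=RMSE_LIMIT):
    """True when the validation RMSE is within the limit."""
    return rmse(actual, predicted) <= limit


# Placeholder validation values for one target variable.
ok = passes_validation([500.0, 600.0], [510.0, 590.0])
```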
TABLE 1:
[082] In another embodiment consistent with the present disclosure, the
RL-based framework (100) is built on one or more control parameters that are selected
from one of: the material control parameter (314A) or the process control
parameter (314B), using the controllable parameters as X’s. These models are
used to predict values with each action taken on the controllable parameters and
to recommend the final values for the controllable parameters to reach optimum
target values. Random Forest (RF) regression models may give better results than the
models built with Ordinary Least Squares (OLS) Regression. Table 2 provides a
comparison of the OLS model with the RF model.
TABLE 2:
[083] Thus, Random Forest models are used to recommend values for
controllable parameters. Table 3 and Table 4 provide a band-wise data count on
optimum ranges for Target variables and percentage of records moved towards
the optimum for each band respectively.
TABLE 3:
TABLE 4:
[084] Thus, the values of the parameters can be set, and the user interface (UI)
shows the recommended values together with the current values of the target
variables and the action that is recommended to bring the target variables to the
optimum.
[085] Overall, the RL-based framework (100) for simulating the steel coil
fabrication environment is a tool comprising hardware implements, which may be
configured to achieve optimal mechanical properties of a coil. It combines machine
learning models and an RL optimizer to suggest actions that move the system
closer to the optimal target mechanical properties. The framework learns from its
actions through a process of reward and penalty and uses the Deep Q-Network
(DQN) technique to optimize the target variables. The framework is particularly
valuable in situations where data samples are limited, making it a useful tool for
optimizing the steel coil fabrication environment.
[086] The order in which the flowchart (400B) is described is not intended to be
construed as a limitation, and any number of the described method blocks can be
combined in any order to implement the flowchart (400B) or alternate methods.
Additionally, individual blocks may be deleted from the flowchart (400B) without
departing from the spirit and scope of the subject matter described herein.
Furthermore, the method can be implemented in any suitable hardware, software,
firmware, or combination thereof.
[087] The illustrated steps are set out to explain the exemplary embodiments
shown, and it may be anticipated that ongoing technological development will
change the way particular functions are performed. These examples are presented
herein for purposes of illustration, and not limitation. Further, the boundaries of the
functional building blocks have been arbitrarily defined herein for the convenience
of the description. Alternative boundaries can be defined so long as the specified
functions and relationships thereof are appropriately performed.
[088] The foregoing method descriptions and the process flow diagrams are
provided merely as illustrative examples and are not intended to require or imply
that the steps of the various embodiments must be performed in the order
presented. As may be appreciated by one of skill in the art, the steps in the
foregoing embodiments may be performed in any order. Words such as
“thereafter,” “then,” “next,” etc. are not intended to limit the order of the steps; these
words are simply used to guide the reader through the description of the methods.
Further, any reference to claim elements in the singular, for example, using the
articles “a,” “an” or “the” is not to be construed as limiting the element to the
singular.
[089] Various embodiments of the present invention are described with
reference to the accompanying drawings, in which some, but not all embodiments
of the invention are shown. Indeed, the invention may be embodied in many
different forms and should not be construed as limited to the embodiments set forth
herein. Rather, these embodiments are provided so that this disclosure may satisfy
applicable legal requirements. The term “or” is used herein in both the alternative
and conjunctive sense, unless otherwise indicated. The terms “illustrative,”
“example,” and “exemplary” are used to denote examples with no indication of quality
level. Like numbers refer to like elements throughout.
[090] The phrases “in an embodiment,” “in one embodiment,” “according to one
embodiment,” and the like generally mean that the feature, structure, or
characteristic following the phrase may be included in at least one embodiment of
the present disclosure and may be included in more than one embodiment of the
present disclosure (importantly, such phrases do not necessarily refer to the same
embodiment).
[091] The word “exemplary” is used herein to mean “serving as an example,
instance, or illustration.” Any implementation described herein as “exemplary” is
not necessarily to be construed as preferred or advantageous over other
implementations.
[092] If the specification states a component or feature “can,” “may,” “could,”
“should,” “would,” “preferably,” “possibly,” “typically,” “optionally,” “for example,”
“often,” or “might” (or other such language) be included or have a characteristic,
that component or feature is not required to be included or to have the
characteristic. Such component or feature may be optionally included in some
embodiments, or it may be excluded.
[093] In some example embodiments, certain ones of the operations herein may
be modified or further amplified as described below. Moreover, in some
embodiments additional optional operations may also be included. It should be
appreciated that each of the modifications, optional additions or amplifications
described herein may be included with the operations herein either alone or in
combination with any others among the features described herein.
[094] Many modifications and other embodiments of the inventions set forth
herein will come to mind to one skilled in the art to which these inventions pertain
having the benefit of teachings presented in the foregoing descriptions and the
associated drawings. Although the figures only show certain components of the
apparatus and systems described herein, it is understood that various other
components may be used in conjunction with the supply management system.
Therefore, it is to be understood that the inventions are not to be limited to the
specific embodiments disclosed and that modifications and other embodiments are
intended to be included within the scope of the appended claims. Moreover, the
steps in the method described above may not necessarily occur in the order
depicted in the accompanying diagrams, and in some cases one or more of the
steps depicted may occur substantially simultaneously, or additional steps may be
involved. Although specific terms are employed herein, they are used in a generic
and descriptive sense only and not for purposes of limitation.
I/We Claim:
1. A method to optimize mechanical properties of a steel coil, the method
comprising:
receiving mechanical properties to be achieved for the steel coil during a
steel coil fabrication;
receiving one or more control parameters for each operation of a plurality of
operations implemented during the steel coil fabrication;
determining, based on the one or more control parameters, an optimal
action for each control parameter of the one or more control parameters, such
that, the optimal action indicates an action required to achieve the mechanical
properties of the steel coil; and
implementing a machine learning (ML) model (102) to create an
environment such that a learning agent associated with the environment is
configured to:
learn about the optimal action for each control parameter of the one or
more control parameters, and
control each operation of the plurality of operations of the steel coil
fabrication, via the learning agent.
2. The method as claimed in claim 1, further comprising:
storing the optimal action in a Q-matrix (106), wherein the optimal action is
stored for each control parameter of the one or more control parameter, which
corresponds to each operation of the plurality of operations; and
updating the Q-matrix (106) in an iterative manner based on the optimal
action determined for corresponding operation of the plurality of operations.
3. The method as claimed in claim 1, further comprising:
controlling, based on the optimal action, the learning agent to
independently control each control parameter of the one or more control
parameter, for each operation of the plurality of operations to achieve the
mechanical properties of the steel coil during the steel coil fabrication.
4. The method as claimed in claim 1, further comprising:
implementing an interpreter element in the environment of the ML model
(102), wherein the interpreter element is configured to train the learning agent,
based on a transfer of at least one of: a reward value or a penalty value, from
the interpreter element to the learning agent; and
controlling, based on one of: the optimal action, the reward value, or the
penalty value, the learning agent to independently control each control
parameter of the one or more control parameter, for each operation of the
plurality of operations to achieve the mechanical properties of the steel coil
during the steel coil fabrication.
5. The method as claimed in claim 1, wherein the optimal action for each control
parameter of the one or more control parameter, is determined based on a Bellman
Equation.
6. The method as claimed in claim 1, wherein the one or more control parameters
are selected from one of: a material control parameter or a process control
parameter.
7. The method as claimed in claim 6, wherein the material control parameter
corresponds to a control percentage of materials for each operation of the plurality
of operations implemented during the steel coil fabrication.
8. The method as claimed in claim 7, wherein the control percentage of materials
are selected from one of: a manganese percentage (Mn%), a carbon percentage
(C%), a niobium percentage (Nb%), a phosphorus percentage (P%), a silicon
percentage (Si%), a titanium percentage (Ti%), or a nitrogen (N%) percentage.
9. The method as claimed in claim 6, wherein the process control parameter
corresponds to a control of various processes associated with each operation of
the plurality of operations implemented during the steel coil fabrication.
10. The method as claimed in claim 9, wherein the process control parameter is
selected from one of: a temperature control, a force control, or a speed control.
11. The method as claimed in claim 1, wherein the mechanical properties are
selected from one of: an Ultimate Tensile Strength (UTS), a Yield Strength (YS),
or an Elongation (EL).
12. The method as claimed in claim 1, wherein the plurality of operations is
selected from one of: a casting operation, a hot rolling operation, a cold rolling
operation, an annealing operation, or a galvanizing operation.
13. A system (302) to optimize mechanical properties of a steel coil, the system
(302) comprises:
a processor (304), configured to:
receive mechanical properties to be achieved for the steel coil during a
steel coil fabrication;
receive one or more control parameters for each operation of a plurality
of operations implemented during the steel coil fabrication;
determine, based on the one or more control parameters, an optimal
action for each control parameter of the one or more control parameters,
such that, the optimal action indicates an action required to achieve the
mechanical properties of the steel coil; and
implement a machine learning (ML) model (102) to create an
environment such that a learning agent associated with the environment is
configured to:
learn about the optimal action for each control parameter of the
one or more control parameters, and
control each operation of the plurality of operations of the steel
coil fabrication, via the learning agent.
14. The system (302) as claimed in claim 13, the system (302) further comprises:
a memory (306) communicably coupled with the processor (304),
wherein the processor (304) is further configured to:
store the optimal action in a Q-matrix (106) of the memory (306),
wherein the optimal action is stored for each control parameter of the
one or more control parameters, which corresponds to each operation
of the plurality of operations; and
update the stored Q-matrix (106) in an iterative manner based on
the optimal action determined for corresponding operation of the
plurality of operations.
15. The system (302) as claimed in claim 13, wherein the processor (304) is further
configured to:
control, based on the optimal action, the learning agent to independently
control each control parameter of the one or more control parameters, for each
operation of the plurality of operations to achieve the mechanical properties of
the steel coil during the steel coil fabrication.
16. The system (302) as claimed in claim 13, wherein the processor (304) is further
configured to:
implement an interpreter element in the environment of the ML model (102),
wherein the interpreter element is configured to train the learning agent, based
on a transfer of at least one of: a reward value or a penalty value, from the
interpreter element to the learning agent; and
control, based on one of: the optimal action, the reward value, or the penalty
value, the learning agent to independently control each control parameter of the
one or more control parameters, for each operation of the plurality of operations
to achieve the mechanical properties of the steel coil during the steel coil
fabrication.
17. The system (302) as claimed in claim 13, wherein the optimal action for each
control parameter of the one or more control parameters is determined based on a
Bellman Equation.
18. The system (302) as claimed in claim 13, wherein the one or more control
parameters are selected from one of: a material control parameter or a process
control parameter.
19. The system (302) as claimed in claim 18, wherein the material control
parameter corresponds to a control percentage of materials for each operation of
the plurality of operations implemented during the steel coil fabrication.
20. The system (302) as claimed in claim 19, wherein the control percentage of
materials is selected from one of: a manganese percentage (Mn%), a carbon
percentage (C%), a niobium percentage (Nb%), a phosphorus percentage (P%),
a silicon percentage (Si%), a titanium percentage (Ti%), or a nitrogen percentage
(N%).
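
Illustrative sketch (not part of the claims): the claims recite a Q-matrix that is updated iteratively, an optimal action determined based on a Bellman Equation, and an interpreter element that transfers a reward value or a penalty value to the learning agent. The following minimal tabular Q-learning example shows one way these pieces could fit together. It is a sketch under stated assumptions, not the applicant's implementation: the three-action set, the learning rate, the discount factor, the epsilon-greedy exploration, and the toy reward rule are all hypothetical choices made only for illustration.

```python
import random

# Operations taken from the claims; one Q-matrix row per operation.
OPERATIONS = ["casting", "hot_rolling", "cold_rolling", "annealing", "galvanizing"]
# Hypothetical action set for a single control parameter (assumption).
ACTIONS = ["decrease", "hold", "increase"]

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def make_q_matrix():
    """Create a Q-matrix of zeros: rows are operations, columns are actions."""
    return [[0.0 for _ in ACTIONS] for _ in OPERATIONS]


def bellman_update(q, state, action, reward, next_state):
    """Apply one Q-learning update derived from the Bellman equation.

    next_state is None after the last operation (terminal step).
    Returns the updated Q-value.
    """
    best_next = max(q[next_state]) if next_state is not None else 0.0
    target = reward + GAMMA * best_next
    q[state][action] += ALPHA * (target - q[state][action])
    return q[state][action]


def interpreter_reward(action):
    """Toy interpreter (assumption): reward 'increase', penalise other actions.

    A real interpreter would compare achieved and target mechanical properties.
    """
    return 1.0 if ACTIONS[action] == "increase" else -1.0


def train(episodes=200, epsilon=0.2, seed=0):
    """Run the operations in order each episode and update the Q-matrix."""
    rng = random.Random(seed)
    q = make_q_matrix()
    for _ in range(episodes):
        for state in range(len(OPERATIONS)):
            # Epsilon-greedy: explore occasionally, otherwise take the best known action.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            reward = interpreter_reward(action)
            next_state = state + 1 if state + 1 < len(OPERATIONS) else None
            bellman_update(q, state, action, reward, next_state)
    return q


def optimal_actions(q):
    """Read the highest-valued action for each operation from the Q-matrix."""
    return {
        op: ACTIONS[max(range(len(ACTIONS)), key=lambda a: q[i][a])]
        for i, op in enumerate(OPERATIONS)
    }
```

Calling `optimal_actions(train())` returns a mapping from each operation to the action the agent currently rates highest. In practice the reward would presumably be derived from how closely the achieved UTS, YS, and EL match their targets rather than from a fixed rule as assumed here.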

Documents

Application Documents

#	Name	Date
1	202331090147-STATEMENT OF UNDERTAKING (FORM 3) [30-12-2023(online)].pdf	2023-12-30
2	202331090147-REQUEST FOR EXAMINATION (FORM-18) [30-12-2023(online)].pdf	2023-12-30
3	202331090147-POWER OF AUTHORITY [30-12-2023(online)].pdf	2023-12-30
4	202331090147-FORM-8 [30-12-2023(online)].pdf	2023-12-30
5	202331090147-FORM 18 [30-12-2023(online)].pdf	2023-12-30
6	202331090147-FORM 1 [30-12-2023(online)].pdf	2023-12-30
7	202331090147-DRAWINGS [30-12-2023(online)].pdf	2023-12-30
8	202331090147-DECLARATION OF INVENTORSHIP (FORM 5) [30-12-2023(online)].pdf	2023-12-30
9	202331090147-COMPLETE SPECIFICATION [30-12-2023(online)].pdf	2023-12-30
10	202331090147-Proof of Right [27-05-2024(online)].pdf	2024-05-27
11	202331090147-FORM-26 [15-05-2025(online)].pdf	2025-05-15