Abstract: The present invention relates to a system (102) and a method (600) for managing Internet of Things (IoT) devices using a time-slotted channel hopping (TSCH) mechanism within a wireless sensor network (WSN). The system (102) utilizes a trusted third party (TTP) framework to manage communication between multiple WSN nodes (110). Data packets are processed and timeslots are allocated to each node, allowing them to transmit, receive, or enter sleep mode based on a deterministic scheduling technique. The system (102) further incorporates a reinforcement learning-based mechanism that dynamically adjusts timeslot allocations based on real-time and historical communication data, optimizing network performance. Priority levels are assigned to data packets, ensuring that high-priority packets, such as emergency alerts, are transmitted before lower-priority ones. The system (102) also employs a dynamically adaptive backoff timer to improve packet delivery efficiency by adjusting timer duration based on packet priority.
Description:
TECHNICAL FIELD
[0001] The present invention relates to the field of wireless communication in Internet of Things (IoT) networks, and more particularly to a system and method for managing communication among wireless sensor network (WSN) nodes using a time-slotted channel hopping (TSCH) mechanism combined with reinforcement learning-based scheduling and prioritization techniques.
BACKGROUND
[0002] Background description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
[0003] The Industrial Internet of Things (IIoT) has revolutionized industrial operations by enabling real-time monitoring, predictive maintenance, and enhanced efficiency through the integration of interconnected devices, sensors, and systems. These large-scale deployments, especially in critical sectors such as healthcare, manufacturing, military, and emergency response, demand communication frameworks that ensure high reliability, minimal latency, and optimal power efficiency. In such high-stakes environments, the timely and dependable transmission of crucial data is vital—for instance, in smart healthcare, any delay in the delivery of life-critical alerts can jeopardize patient outcomes.
[0004] To address these stringent requirements, several TSCH-based solutions have been developed. Existing frameworks have been categorized into power-sensitive, latency-sensitive, and hybrid models, each employing reinforcement learning, hierarchical scheduling, and dynamic slot allocation techniques. While these solutions offer improvements, they also present certain drawbacks. Many models either prioritize power savings at the cost of delay or optimize for latency without efficiently managing energy consumption. Additionally, some protocols lack adaptability to real-time traffic fluctuations or fail to account for the varying priority of data packets, resulting in packet loss, increased collisions, or suboptimal scheduling in congested networks.
[0005] Many techniques have evolved to obviate the above-mentioned issues. For instance, patent document US10587501B2 discloses communication scheduling methods using depth-aware slotframes for emergency transmissions. Similarly, US20250071040A1 presents scheduling strategies with parameters such as data rate, modulation schemes, and signal-to-noise ratio to enhance cellular network communication. While these references introduce innovative scheduling methods, they do not disclose or sufficiently address the integration of a multi-agent reinforcement learning model specifically designed to manage both latency and power efficiency dynamically, nor do they offer prioritization mechanisms that factor in packet importance during scheduling in TSCH-based clustered IoT networks.
[0006] There is, therefore, a need for an improved solution that overcomes these limitations by offering a scalable, priority-aware, and intelligent timeslot scheduling mechanism capable of balancing power and latency requirements dynamically in large-scale clustered IIoT environments.
OBJECTS OF THE PRESENT DISCLOSURE
[0007] A general object of the present disclosure is to provide an intelligent, priority-aware reinforcement learning-based scheduling framework for optimizing performance in diverse IoT network scenarios.
[0008] An object of the present disclosure is to provide an improved packet delivery ratio (PDR) in large-scale static networks through adaptive transmission scheduling.
[0009] Another object of the present disclosure is to provide reduced collision rates in time-slotted channel hopping (TSCH) networks under dense deployment conditions.
[0010] Another object of the present disclosure is to provide lower power consumption for power-sensitive IoT applications using energy-efficient scheduling decisions.
[0011] Another object of the present disclosure is to provide minimized communication latency in latency-sensitive applications by dynamically prioritizing time slots.
[0012] Another object of the present disclosure is to provide reliable data delivery in mobile IoT environments through reinforcement learning-based adaptation.
[0013] Another object of the present disclosure is to provide near-zero latency and high PDR in event-driven networks that require real-time responsiveness.
[0014] Another object of the present disclosure is to provide reduced broadcast and unicast collision rates in networks handling critical and routine data simultaneously.
[0015] Another object of the present disclosure is to provide minimal routing overhead in heterogeneous networks through optimized scheduling and path selection.
[0016] Another object of the present disclosure is to provide enhanced transmission efficiency by prioritizing emergency or high-importance data over routine traffic.
[0017] Another object of the present disclosure is to achieve dynamic slot allocation using multiple DDQN-trained agents that evaluate network state and optimize TSCH operations in clustered WSNs.
[0018] Another object of the present disclosure is to provide reliable communication support for smart environments such as healthcare, industry, and smart homes.
[0019] Another object of the present disclosure is to provide congestion management and QoS enhancement by minimizing expected transmission count (ETX).
[0020] Another object of the present disclosure is to provide scheduling adaptability in networks with varying traffic loads, topologies, and mobility levels.
[0021] Another object of the present disclosure is to provide reinforcement learning-based orchestration of time slots for autonomous and efficient IoT communication.
SUMMARY
[0022] Aspects of the present disclosure relate to the field of wireless communication in Internet of Things (IoT) networks, and more particularly to a system and method for managing communication among wireless sensor network (WSN) nodes using a time-slotted channel hopping (TSCH) mechanism combined with reinforcement learning-based scheduling and prioritization techniques. The proposed solution introduces a dynamic, intelligent scheduling framework that leverages double-deep Q-learning (DDQN) to adaptively allocate time slots based on real-time network conditions and application priorities. This approach provides multiple advantages, including improved packet delivery ratio (PDR) in large-scale static networks, reduced collision rates under dense deployments, and lower power consumption for energy-constrained devices. It also minimizes communication latency for time-sensitive applications by prioritizing critical transmissions, ensures reliable data delivery in mobile environments, and supports near-zero latency and high reliability in event-driven scenarios.
[0023] Furthermore, the proposed solution effectively reduces broadcast and unicast collision rates, minimizes routing overhead in heterogeneous networks, and enhances transmission efficiency by prioritizing emergency or high-importance data over routine traffic. The framework is adaptable to varying network topologies and traffic loads, offering congestion management, expected transmission count (ETX) optimization, and robust performance across industrial, healthcare, and smart home environments.
[0024] An aspect of the present disclosure pertains to a system for management of Internet of Things (IoT) devices, specifically facilitating intelligent communication among wireless sensor network (WSN) nodes through a trusted third party (TTP) framework integrated with time-slotted channel hopping (TSCH) mechanisms and reinforcement learning-based scheduling. The system comprises a processor and memory that executes instructions to manage and process packets received from a plurality of WSN nodes. It allocates timeslots using the TSCH mechanism, wherein each node is selectively directed to transmit, receive, or enter a sleep mode during its allocated slot. The scheduling of these timeslots is implemented using a deterministic approach based on an orchestra algorithm, allowing each data packet to be assigned to a corresponding slot for prioritized transmission.
[0025] The system further incorporates a reinforcement learning-based mechanism that dynamically updates timeslot allocations based on predefined parameters and training data, which includes both real-time and historical communication records. Each data packet is classified and assigned a priority level, enabling the system to transmit packets in a sequence determined by the assigned priorities. The allocation of timeslots can be adaptively adjusted based on factors such as the number of incoming packets, observed transmission or reception behavior of each node, and recorded packet loss rates.
[0026] Additionally, the processor is configured to categorize WSN nodes as high-priority or low-priority depending on the urgency of the packets they generate. Nodes that transmit emergency alerts or critical notifications are designated as high-priority. Data packets from high-priority nodes are inserted at the front end of a TSCH-associated queue, ensuring their transmission precedes packets from low-priority nodes, which are inserted at the rear end of the queue.
[0027] In an aspect, the reinforcement learning mechanism includes multiple agents trained using a Double Deep Q-Network (DDQN), each capable of selecting from a set of predefined actions including maintaining the current network state, transmitting packets based on priority, and retransmitting failed packets. These agents utilize a shared memory to support replay and learning across distinct datasets, enhancing the system's capability to manage TSCH operations within clustered WSN topologies.
[0028] In an aspect, the processor also employs a dynamically adaptive backoff timer linked to the TSCH mechanism, wherein the duration of the backoff is determined by the priority level of each data packet. Higher-priority packets are assigned shorter backoff durations relative to those of lower priority, thereby ensuring timely and efficient communication in diverse and potentially congested network environments.
[0029] Another aspect of the present disclosure pertains to a method for managing Internet of Things (IoT) devices within a wireless sensor network (WSN), utilizing a trusted third party (TTP) framework for efficient communication management among the network nodes. The method includes receiving one or more data packets transmitted from the WSN nodes and processing them through the allocation of timeslots using a time-slotted channel hopping (TSCH) mechanism. Each node is selectively directed to either transmit, receive, or enter sleep mode during its allocated timeslot. The timeslots are scheduled using a deterministic scheduling technique based on an orchestra algorithm, which assigns each data packet to a corresponding timeslot for transmission. The method also incorporates a reinforcement learning-based mechanism to dynamically update the timeslot allocation based on predefined parameters and training data. This training data includes real-time and historical communication records, enabling the system to adapt and optimize its scheduling decisions over time. Each data packet is classified according to a predefined classification, and a priority level is assigned to it. The data packets are then transmitted in a sequence based on their assigned priority levels. Additionally, the method allows for the dynamic adjustment of timeslot allocation based on factors such as the quantity of incoming data packets, detected changes in node transmission or reception activities, or recorded packet loss rates during transmission.
[0030] Furthermore, the method includes classifying each WSN node as either a high-priority or low-priority node depending on the urgency of the data packets generated. Nodes transmitting emergency alerts or critical notifications are designated as high-priority. These high-priority nodes are given precedence in the timeslot allocation, ensuring that their data packets are transmitted ahead of those from low-priority nodes, thus optimizing communication efficiency in time-sensitive environments. This method offers enhanced communication management for IoT networks, ensuring efficient, adaptive, and priority-based packet transmission in diverse network conditions.
BRIEF DESCRIPTION OF DRAWINGS
[0031] The accompanying drawings are included to provide a further understanding of the present disclosure and are incorporated in and constitute a part of this specification. The drawings illustrate exemplary embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure. The diagrams are for illustration only, which thus is not a limitation of the present disclosure.
[0032] FIG. 1 illustrates an exemplary network architecture of proposed system for management of Internet of Things (IoT) devices, in accordance with an embodiment of the present disclosure.
[0033] FIG. 2 illustrates functional units of a processor associated with the proposed system, in accordance with an embodiment of the present disclosure.
[0034] FIG. 3 illustrates an exemplary architecture to depict operation of a prioritized multi-agent reinforcement learning (PMRL) time slotted channel hopping (TSCH) in proposed system, in accordance with an embodiment of the present disclosure.
[0035] FIG. 4A illustrates an exemplary view of a network with ten clusters, each containing ten nodes, in accordance with an embodiment of the present disclosure.
[0036] FIG. 4B illustrates an exemplary view of a mobile network with ten clusters, each containing ten nodes, in accordance with an embodiment of the present disclosure.
[0037] FIG. 4C illustrates an exemplary view of a heterogeneous network, in accordance with an embodiment of the present disclosure.
[0038] FIG. 5A illustrates an exemplary view of an algorithm for training process of Double Deep Q-Network (DDQN) in the proposed system, in accordance with an embodiment of the present disclosure.
[0039] FIG. 5B illustrates an exemplary view of an algorithm for simulator feedback with DDQN and experience replay, in accordance with an embodiment of the present disclosure.
[0040] FIG. 6 illustrates a flow chart of a method for managing Internet of Things (IoT) devices, in accordance with an embodiment of the present disclosure.
[0041] FIG. 7 illustrates exemplary graphical representations of performance evaluation of PMRL, where FIG. 7A shows the total packet loss of the network, FIG. 7B shows the total power consumption of the network, and FIG. 7C shows the packet delivery ratio of the network, in accordance with an embodiment of the present disclosure.
[0042] FIG. 8A illustrates an exemplary graphical representation of latency comparison of PMRL with default, MRL, and priority protocols, in accordance with an embodiment of the present disclosure.
[0043] FIG. 8B illustrates a graphical representation of collision comparison of PMRL with default, MRL, and priority protocols, in accordance with an embodiment of the present disclosure.
[0044] FIG. 9 illustrates an exemplary graphical representation of ETX for different protocols for each node in the Orchestra Scenario, in accordance with an embodiment of the present disclosure.
[0045] FIG. 10 illustrates exemplary graphical representations of performance evaluation of OPTIMA prioritized multi-agent reinforcement learning (OPMRL), where FIG. 10A shows the total packet loss of the network, FIG. 10B shows the packet delivery ratio of the network, and FIG. 10C shows the total power consumption of the network, in accordance with an embodiment of the present disclosure.
[0046] FIG. 11A illustrates an exemplary graphical representation of latency comparison of Optimized Prioritized Multi-Agent Reinforcement Learning (OPMRL) with Optima, OMRL, and Opriority protocols, in accordance with an embodiment of the present disclosure.
[0047] FIG. 11B illustrates a graphical representation of collision comparison of OPMRL with Optima, OMRL, and Opriority protocols, in accordance with an embodiment of the present disclosure.
[0048] FIG. 12 illustrates an exemplary graphical representation of ETX for different protocols for each node in the Optima Orchestra Scenario, in accordance with an embodiment of the present disclosure.
[0049] FIG. 13 illustrates exemplary graphical representations of heatmaps, where FIG. 13A shows the trade-off between power consumption and latency, FIG. 13B shows the trade-off between power consumption and packet delivery ratio (PDR), and FIG. 13C shows the trade-off between latency and PDR, in accordance with an embodiment of the present disclosure.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
[0050] The following is a detailed description of embodiments of the disclosure represented in the accompanying drawings. The disclosed embodiments are merely exemplary of the invention, which may be embodied in various forms. The embodiments are in such detail as to clearly communicate the disclosure. However, the amount of detail offered is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure as defined by the appended claims.
[0051] Embodiments of the present disclosure relate to the field of wireless communication in Internet of Things (IoT) networks, and more particularly to a system and method for managing communication among wireless sensor network (WSN) nodes using a time-slotted channel hopping (TSCH) mechanism combined with reinforcement learning-based scheduling and prioritization techniques.
[0052] Referring to FIG. 1, an exemplary block diagram 100 of the proposed system 102 for management of Internet of Things (IoT) devices 104 is disclosed. The IoT devices 104 may be any network-connected sensing or actuating nodes that collect, transmit, or respond to data in various applications. These may include temperature sensors, motion detectors, health monitors, industrial controllers, smart home appliances, surveillance cameras, environmental monitoring sensors, wearable devices, and the like, used in healthcare, industrial automation, smart cities, agriculture, transportation, and the like. The system 102 includes one or more processor(s) 106 that may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that manipulate data based on operational instructions. Among other capabilities, the processor 106 may be configured to fetch and execute computer-readable instructions stored in a memory 108. The memory 108 may store one or more computer-readable instructions or routines, which may be fetched and executed to manage the IoT devices 104. The memory 108 may include any non-transitory storage device including, for example, volatile memory such as Random Access Memory (RAM), or non-volatile memory such as an Erasable Programmable Read-Only Memory (EPROM), flash memory, and the like.
[0053] The processor 106 may be communicatively coupled to the IoT devices 104 through a communication network 112. The processor 106 is configured to manage data communication, perform packet classification, allocate transmission timeslots, and implement reinforcement learning-based scheduling strategies to optimize network performance. A trusted third party (TTP) framework is implemented to manage communication among a plurality of wireless sensor network (WSN) nodes 110. The WSN nodes 110 are configured to sense environmental or operational parameters and transmit one or more corresponding data packets to the processor 106 via the communication network 112. These nodes may include sensors, actuators, or embedded IoT devices deployed across a defined area.
[0054] The communication network 112 can be a wireless network, a wired network or a combination thereof that can be implemented as one of the different types of networks, such as Intranet, Local Area Network (LAN), Wide Area Network (WAN), Internet, and the like. Furthermore, the communication network 112 can either be a dedicated network or a shared network. The shared network can represent an association of different types of networks that can use variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like.
[0055] Referring to FIG. 2, exemplary functional units of the processor 106 for management of IoT devices 104 are disclosed. The processor 106 may also include interface(s) 206, which may include a variety of interfaces, for example, interfaces for data input and output devices, referred to as Input/Output (I/O) devices, storage devices, and the like. The interface(s) 206 may provide a communication pathway for one or more components of the system 102. Examples of such components include, but are not limited to, processing engine(s) 208 and a database 210.
[0056] In an embodiment, the processing engine(s) 208 may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processing engine(s) 208. In other embodiments, the processing engine(s) 208 may be implemented by electronic circuitry. The database 210 may include data that is either stored or generated as a result of functionalities implemented by any of the components of the processing engine(s) 208. In some embodiments, the processing engine(s) 208 may include a TTP communication management module 212, a TSCH scheduling and timeslot allocation module 214, a packet classification and prioritization module 216, a reinforcement learning-based optimization module 218, a data processing and network monitoring module 220, and other module(s) 222. The other module(s) 222 may implement functionalities that supplement applications/functions performed by the system 102.
[0057] In an embodiment, the TTP communication management module 212 is configured to ensure that communication between the processor 106 and the WSN nodes 110 is both secure and reliable. This TTP communication management module 212 operates based on the trusted third party (TTP) framework, which acts as an intermediary authority to authenticate, validate, and monitor all data exchanges. By using the TTP framework, the TTP communication management module 212 helps establish trust among the WSN nodes 110 and the processor 106, prevents unauthorized access or tampering, and ensures that the transmitted data maintains its integrity and confidentiality throughout the communication process.
[0058] In an embodiment, the TSCH scheduling and timeslot allocation module 214 is configured to manage efficient and conflict-free communication among the WSN nodes 110. This module processes incoming data packets from the WSN nodes 110 and assigns dedicated timeslots to each node using a Time-Slotted Channel Hopping (TSCH) mechanism. During its allocated slot, each node is allowed to transmit, receive, or enter sleep mode to conserve energy. The scheduling of these timeslots follows a deterministic approach using the Orchestra algorithm, which ensures each data packet is correctly matched to a timeslot for smooth transmission. To maintain optimal performance and network consistency, the TSCH scheduling and timeslot allocation module 214 dynamically adjusts timeslot allocations based on real-time conditions, such as the number of incoming packets, changes in the transmission or reception behavior of a node, or a noticeable packet loss rate during communication.
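Orchestra's autonomous scheduling can be sketched with an identifier-based slot rule: each node derives its receive slot deterministically from its own identifier and its transmit slot from the identifier of the intended receiver, so no negotiation is needed. The modulo rule and the 17-slot slotframe below are illustrative assumptions, not the exact schedule of the disclosed system.

```python
# Hypothetical sketch of Orchestra-style autonomous slot assignment.
SLOTFRAME_LENGTH = 17  # assumed unicast slotframe length (a prime, as is common)

def orchestra_slot(node_id: int, slotframe_length: int = SLOTFRAME_LENGTH) -> int:
    """Deterministically map a node identifier to a timeslot offset."""
    return node_id % slotframe_length

def action_for(node_id: int, neighbor_id: int, current_slot: int) -> str:
    """Decide whether a node transmits, receives, or sleeps in a given slot."""
    if current_slot == orchestra_slot(node_id):
        return "receive"    # own slot: listen for packets addressed to us
    if current_slot == orchestra_slot(neighbor_id):
        return "transmit"   # neighbor's slot: send queued packets to it
    return "sleep"          # otherwise conserve energy

# Example: node 5 communicating with parent node 3 across one slotframe
schedule = [action_for(5, 3, s) for s in range(SLOTFRAME_LENGTH)]
```

Because every node applies the same rule, all nodes agree on the schedule without exchanging any control messages, which is the property the deterministic Orchestra approach relies on.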
[0059] In addition, the TSCH scheduling and timeslot allocation module 214 is further configured to apply a dynamically adaptive backoff timer associated with the time-slotted channel hopping (TSCH) mechanism. The backoff timer is adjusted based on the priority level of the one or more data packets, such that higher-priority packets are assigned a shorter backoff duration relative to lower-priority packets. This prioritization ensures faster retransmission opportunities for critical packets, enhancing responsiveness and reliability of the system 102.
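A minimal sketch of such a priority-scaled backoff is shown below, assuming binary priorities and a halved contention window for high-priority packets; the actual backoff expression and window sizes of the system are not reproduced here.

```python
import random

BASE_BACKOFF_SLOTS = 8  # assumed base contention window, in timeslots

def backoff_slots(priority: int, base: int = BASE_BACKOFF_SLOTS) -> int:
    """Draw a random backoff delay; higher priority gets a shorter window.

    priority: 1 for high-priority packets, 0 for low-priority packets.
    Halving the window for priority 1 is an illustrative choice, not the
    exact rule from the specification.
    """
    window = base // 2 if priority == 1 else base
    return random.randint(0, window - 1)
```

With these assumed values, a high-priority packet waits at most 3 slots before a retransmission attempt, while a low-priority packet may wait up to 7, giving critical traffic earlier access to the channel.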
[0060] In an embodiment, the packet classification and prioritization module 216 is configured to assign a priority level to each incoming data packet based on a predefined classification scheme. After assigning priorities, the packet classification and prioritization module 216 arranges transmission of packets so that packets with higher priority are transmitted earlier. This module also classifies the WSN nodes 110 themselves as either high-priority or low-priority nodes depending on the urgency of the data they generate. For instance, nodes that send emergency alerts or critical notifications are marked as high-priority nodes. Packets coming from high-priority nodes are inserted at the front end of a TSCH queue, ensuring their quicker transmission, while packets from low-priority nodes are placed at the rear end of the queue, as described below:
P → front(Q), if P_priority = 1
P → end(Q), if P_priority = 0
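The queue discipline expressed above maps directly onto a double-ended queue: high-priority packets (priority 1) are inserted at the front, low-priority packets (priority 0) at the rear. The sketch below is a minimal illustration; the dictionary packet layout is an assumption.

```python
from collections import deque

def enqueue(queue: deque, packet: dict) -> None:
    """Insert a packet into the TSCH queue according to its priority."""
    if packet["priority"] == 1:
        queue.appendleft(packet)   # emergency/critical: transmit first
    else:
        queue.append(packet)       # routine traffic: transmit in arrival order

q = deque()
enqueue(q, {"id": "routine-1", "priority": 0})
enqueue(q, {"id": "alert-1", "priority": 1})
enqueue(q, {"id": "routine-2", "priority": 0})
# Queue order is now: alert-1, routine-1, routine-2
```

Packets are dequeued from the front for transmission, so the emergency alert is sent before both routine packets even though it arrived second.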
[0061] In addition, to enhance reliability of high-priority packet delivery, the proposed system 102 introduces an acknowledgement success tracking mechanism. This mechanism monitors acknowledgement (ACK) responses received from the wireless sensor network nodes 110. This mechanism gives precedence to high-priority packets by incrementing an acknowledgement count upon each successful ACK. This enables high-priority packets to be quickly retransmitted if needed:
ack_count ← ack_count + 1, if ACK is successful
[0062] In an embodiment, the reinforcement learning-based optimization module 218 is configured to execute a reinforcement learning-based mechanism that dynamically updates the allocation of timeslots for the WSN nodes 110. The updates are based on predefined parameters and training data, which includes both real-time communication data and historical communication records collected from the network. The reinforcement learning mechanism predicts future communication traffic patterns by analyzing these records using multiple agents, each trained with a Double Deep Q-Network (DDQN) model, an advanced reinforcement learning method that improves stability and learning efficiency. Each agent is capable of choosing from a set of predefined actions: maintain the current network state, transmit packets with respect to their assigned priority levels, or retransmit packets if a previous transmission has failed.
[0063] Additionally, these agents share a memory pool for experience replay, i.e. they collectively learn from various datasets to enhance decision-making. This collective learning helps to efficiently manage the TSCH mechanism for timeslot scheduling, especially within clustered wireless sensor network topologies, improving reliability and responsiveness of the system 102.
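The shared-memory, multi-agent arrangement described above can be sketched as follows. This is an illustrative skeleton only: the action names, buffer capacity, and epsilon value are assumptions, and a plain Q-table stands in for the trained DDQN.

```python
import random
from collections import deque

# Predefined actions available to every agent (names are illustrative).
ACTIONS = ("maintain_state", "transmit_by_priority", "retransmit")

class SharedReplayMemory:
    """Experience pool shared by all agents; capacity is an assumed value."""
    def __init__(self, capacity: int = 10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int):
        # Draw a random minibatch for experience replay.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

class Agent:
    """Epsilon-greedy agent over the three predefined TSCH actions."""
    def __init__(self, memory: SharedReplayMemory, epsilon: float = 0.1):
        self.memory = memory   # replay memory shared across agents
        self.q = {}            # (state, action) -> value, learned elsewhere
        self.epsilon = epsilon

    def act(self, state) -> str:
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)  # explore
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))  # exploit

memory = SharedReplayMemory()
agents = [Agent(memory) for _ in range(3)]  # e.g., one agent per dataset
```

Because all agents push experiences into, and sample from, the same buffer, each one learns from transitions it did not generate itself, which is the collective-learning property the module relies on.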
[0064] In an embodiment, the data processing and network monitoring module 220 is configured to handle two major responsibilities: processing incoming data and monitoring network performance. For data processing, the data processing and network monitoring module 220 organizes, filters, and prepares incoming data packets from the WSN nodes 110 for further actions like classification, prioritization, or scheduling. This ensures that the data is clean, correctly formatted, and ready for efficient transmission or storage. For network monitoring, the data processing and network monitoring module 220 continuously observe operational parameters of the network, such as packet loss rates, node activity, communication delays, power consumption, and traffic load. Thus, can detect anomalies, congestion, or any performance degradation early. The insights gathered through this monitoring help other modules make more informed decisions for optimizing communication and resource usage across the IoT network.
[0065] Referring to FIG. 3, an exemplary architecture 300 to depict operation of a prioritized multi-agent reinforcement learning (PMRL) time slotted channel hopping (TSCH) in the proposed system 102 is disclosed. The system 102 integrates a priority-based MRL framework into TSCH, customized for latency-sensitive and power-sensitive networks. Utilizing the DDQN, the system 102 intelligently manages time slot allocation, optimizing critical data transmission while minimizing power consumption. By evaluating the network state and using learned policies, the DDQN dynamically selects actions to reduce collisions and improve efficiency, tracking metrics like expected transmission count (ETX). A reward mechanism ensures ongoing optimization, rewarding ETX reduction and penalizing inefficiencies, as detailed in the following subsections. Although ETX is traditionally used to assess link reliability, the proposed disclosure extends its application to further enhance network performance. By frequently updating ETX based on recent transmission success rates, the system reduces unnecessary retransmissions, especially benefiting high-priority nodes that demand timely and reliable data delivery. The ETX of a link is computed as:
ETX = 1 / (D_f × D_r)
where D_f and D_r denote the forward and reverse packet delivery ratios of the link.
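The ETX bookkeeping and reward mechanism described above can be approximated with a simple running estimate. The sketch below uses a simplified single-direction estimate (transmissions per successful delivery); the cap of 16.0 for links with no recent successes and the reward form are illustrative assumptions, not values taken from the specification.

```python
def update_etx(tx_attempts: int, acked: int) -> float:
    """Estimate ETX from recent transmission statistics.

    ETX approximates the expected number of transmissions needed for one
    successful delivery; with no successes yet, a large penalty value is
    returned (the cap of 16.0 is an illustrative convention).
    """
    if acked == 0:
        return 16.0
    return tx_attempts / acked

def reward(old_etx: float, new_etx: float) -> float:
    """Illustrative reward: positive when ETX drops, negative when it rises."""
    return old_etx - new_etx
```

For example, a link that needed 10 transmissions for 5 deliveries has ETX 2.0; if an agent's scheduling choice brings this down to 1.5 in the next window, it receives a positive reward of 0.5, steering the policy toward fewer retransmissions.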
[0066] In addition, a network topology is modeled as a graph G = (N, E), where N denotes the set of nodes and E represents the set of communication links. This topology includes three distinct scenarios: homogeneous, mobile, and heterogeneous multi-hop networks, as shown in FIGs. 4A, 4B, and 4C, respectively. In each case, nodes n ∈ N are organized in clusters around a central sink node (node 1, green node), with links E defining multi-hop communication paths between nodes. In the homogeneous network, as shown in FIG. 4A, all the nodes generate packets at uniform intervals. This consistent packet generation allows for a straightforward TSCH schedule with timeslots U ∈ H, enabling synchronized and efficient packet transmission. In this setup, the yellow nodes represent trusted third parties (TTPs), which are equipped with event timers to forward data from cluster nodes to the sink node using the routing protocol for low-power and lossy networks (RPL). This configuration minimizes collisions and promotes reliable communication within the cluster.
[0067] In the heterogeneous network, as shown in FIG. 4C, nodes have varied packet generation intervals. This diversity in traffic requires adaptive forwarding paths W and flexible TSCH slot scheduling H, coordinated by the TTPs to handle the irregular packet flow effectively. These adaptive mechanisms enable the network to accommodate different data rates while maintaining efficient communication.
[0068] In the mobile multi-hop network, as shown in FIG. 4B, nodes within each cluster move dynamically, varying their proximity to the sink node. These mobility-induced changes affect the set of links |E| and alter forwarding paths W. In this case, the TTPs maintain stable communication paths, adapting to fluctuations in link quality to ensure reliable data forwarding despite mobility. Additionally, there exists an exceptional scenario, termed the critical event-driven network. This scenario mirrors the architecture of the homogeneous network, but introduces a critical node that transmits application packets at varying timestamps. If this node detects an important event at a specific timestamp t (e.g., 3300 seconds), it is prioritized for immediate transmission to the sink node via the TTP. While the critical node sends data upon event detection, the remaining nodes continue to transmit at regular intervals. The configuration details for each network are summarized in Table I, where the node numbers of the prioritized nodes are also indicated.
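The clustered graph G = (N, E) underlying these scenarios can be generated programmatically for simulation. The sketch below assumes the numbering convention of the figures (node 1 is the sink) and treats one node per cluster as its TTP; the sequential identifiers and link directions are illustrative.

```python
def build_clustered_topology(num_clusters: int = 10, nodes_per_cluster: int = 10):
    """Build G = (N, E) for a clustered WSN with a single sink (node 1).

    Each cluster contains one TTP node linked to the sink; the remaining
    member nodes link to their cluster's TTP, forming multi-hop paths.
    """
    sink = 1
    nodes = {sink}
    edges = set()
    next_id = 2
    for _ in range(num_clusters):
        ttp = next_id                          # first node of the cluster is its TTP
        next_id += 1
        nodes.add(ttp)
        edges.add((ttp, sink))                 # TTP forwards to the sink (RPL)
        for _ in range(nodes_per_cluster - 1): # remaining cluster members
            member = next_id
            next_id += 1
            nodes.add(member)
            edges.add((member, ttp))           # member forwards to its TTP
    return nodes, edges

N, E = build_clustered_topology()
```

With the ten-clusters-of-ten configuration from FIG. 4A, this yields 101 nodes (sink plus 100 cluster nodes) and 100 directed forwarding links, matching the two-hop member-TTP-sink structure described above.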
TABLE I: NETWORK CONFIGURATIONS AND PRIORITIZED NODES
[0069] Referring to FIG. 5A, an exemplary view of an algorithm for the training process of the Double Deep Q-Network (DDQN) in the proposed system is disclosed, in accordance with an embodiment of the present disclosure. The proposed system 102 implements a multi-agent reinforcement learning (MRL) framework configured to manage five complex tasks across clusters in large-scale wireless sensor networks. Two reinforcement learning approaches, Double Deep Q-Network (DDQN) and Proximal Policy Optimization (PPO), are evaluated. After extensive experimentation with parameter tuning, the configurations listed in Table II are found to deliver optimal performance for a network topology consisting of 10 clusters, each containing 10 nodes. DDQN consistently achieves higher reward values than PPO under their respective optimal settings, and therefore the proposed system 102 adopts DDQN as the final algorithm due to its superior reward optimization capabilities.
TABLE II: COMPARISON OF PARAMETERS FOR DDQN AND PPO MODELS
[0070] In the proposed system, each agent in the MRL framework selects from three possible actions, at1, at2, and at3, each optimizing a specific aspect of the Time-Slotted Channel Hopping (TSCH) network: at1 (Maintain Current State), wherein the agent keeps the network state unchanged when no optimization is needed, preserving system resources; at2 (Transmit with Prioritization), wherein the agent prioritizes transmission of high-priority data packets, thereby reducing the Expected Transmission Count (ETX) and ensuring reliable delivery of critical information; and at3 (Retransmission/Reattempt), wherein the agent re-attempts packet transmission in case of failure, improving the likelihood of successful data delivery. The proposed system 102 employs three DDQN agents, each trained on a distinct dataset (Dataset1, Dataset2, Dataset3). These agents optimize their action-value functions through a balance of exploration and exploitation. By utilizing a shared replay memory, the agents learn from diverse experiences and independently fine-tune their policies. This design enables robust, scalable, and priority-aware management of TSCH scheduling within clustered wireless sensor network topologies.
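The balance of exploration and exploitation over the three actions can be sketched with a standard epsilon-greedy rule. This is an illustrative sketch, not the specification's implementation; the action labels and the `select_action` helper are assumed names.

```python
import random

# The three agent actions described above: at1, at2, at3
ACTIONS = ("maintain", "transmit_prioritized", "retransmit")

def select_action(q_values, epsilon):
    """Epsilon-greedy choice over the three TSCH-management actions:
    explore a random action with probability epsilon, otherwise exploit
    the action with the highest estimated Q-value."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q_values[a])

# With epsilon = 0 the agent purely exploits its current Q-estimates:
idx = select_action([0.1, 0.9, 0.3], epsilon=0.0)
```

During training, epsilon would typically be annealed from a high value toward a small floor so early episodes explore all three actions.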
[0071] In addition, to balance both latency minimization and power efficiency, the proposed system integrates Prioritized Multi-Agent Reinforcement Learning (PMRL) and Optimized Prioritized Multi-Agent Reinforcement Learning (OPMRL) models within a Time-Slotted Channel Hopping (TSCH) network. These models incorporate priority mechanisms to optimize communication performance. Specifically, PMRL is implemented on the Orchestra scheduler (Default), which efficiently manages power consumption, while OPMRL is deployed on the OPTIMA Orchestra scheduler, which is known for maximizing reliability and minimizing latency. Algorithm 2, as disclosed in FIG. 5B, illustrates that each agent in the multi-agent reinforcement learning model interacts with a real-time TSCH simulation environment. Upon receiving updated network state (st), the agent selects an action (at) and sends it to the environment. The environment then executes the action by adjusting queue management or transmission priorities, which directly impacts network metrics such as Expected Transmission Count (ETX), number of transmissions (numtx), and acknowledgements (ACKs). The environment evaluates the impact of selected action (at) and returns a reward (rt) based on observed outcome. The agent uses this feedback to refine its policy by incorporating the transition tuple (st, at, rt, st+1), updating its action-value function Q(s, a; θ) accordingly. The expected returns are expressed as:
Q(s, a; θ) = r + γ max_{a′} Q(s′, a′; θ⁻),
where γ is the discount factor that weights future rewards and θ⁻ denotes the parameters of the target network. Through this continuous feedback loop, agents iteratively improve their policies to maximize cumulative rewards. By dynamically prioritizing high-value data packets, PMRL and OPMRL significantly enhance TSCH network performance. These models are thus well-suited for latency-sensitive and power-constrained applications through real-time, simulation-based learning.
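The target computation above can be sketched numerically. In Double DQN the online network selects the best next action while the target network (θ⁻) evaluates it, which reduces overestimation bias; the function name and the example Q-values below are illustrative assumptions.

```python
def ddqn_target(reward, next_q_online, next_q_target, gamma, done=False):
    """Double-DQN target value y = r + gamma * Q_target(s', argmax_a
    Q_online(s', a)): the online net picks the action, the target net
    (theta-minus) scores it."""
    if done:
        return reward  # terminal transitions carry no bootstrapped term
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]

# Online net prefers action 1 (0.8); target net scores that action as 0.6:
y = ddqn_target(reward=1.0, next_q_online=[0.2, 0.8, 0.5],
                next_q_target=[0.3, 0.6, 0.9], gamma=0.9)
# y = 1.0 + 0.9 * 0.6 = 1.54
```

The agent then regresses Q(s, a; θ) toward y for each transition tuple (st, at, rt, st+1) drawn from the shared replay memory.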
[0072] In an exemplary implementation, the proposed system 102 utilizes TSCH-Sim, a JavaScript-based simulator that supports large-scale and mobile network simulations and demonstrates superior performance compared to existing tools such as OpenWSN and Cooja. Simulations are conducted on an Intel Core i7 CPU with 16GB RAM, modeling a variety of network topologies, including homogeneous, heterogeneous, and critical event-driven networks. For power-sensitive applications, the Default Orchestra scheduler is employed, while OPTIMA Orchestra is adopted for latency-sensitive scenarios. The proposed system 102 introduces priority mechanisms into the baseline TSCH scheduler and evaluates their performance through comparative analysis. This benchmarks the base TSCH protocol against the proposed PMRL and OPMRL frameworks, which utilize a Double Deep Q-Network (DDQN)-based scheduling strategy within the TSCH environment. The configurations and notation schemes used for these protocols are detailed in Table IV, and comparisons are made based on the performance metrics outlined therein. To enable this evaluation, TSCH-Sim is modified to incorporate priority-based scheduling and integrated reinforcement learning for real-time simulations. The simulation evaluates the following protocols under both static and mobile environments: Default (baseline Orchestra), Priority-enhanced TSCH, PMRL (Prioritized Multi-Agent Reinforcement Learning), and OPMRL (Optimized PMRL using OPTIMA Orchestra). Further, performance is assessed through metrics such as latency, power consumption, and transmission reliability, with results summarized in Table III, demonstrating the effectiveness of the proposed system 102 in supporting both power- and latency-sensitive IoT applications.
TABLE III: SIMULATION PARAMETERS
TABLE IV: PERFORMANCE PARAMETERS FORMULATION
[0073] Referring to FIG. 6, a flow diagram of the proposed method 600 for managing Internet of Things (IoT) devices is disclosed. The method 600 begins at step 602, where a Trusted Third Party (TTP) framework is implemented. This framework coordinates and manages communication between multiple Wireless Sensor Network (WSN) nodes, ensuring that the communication process is secure, organized, and centrally supervised.
[0074] Continuing further, at block 604, the method 600 includes receiving, by a processor 106, one or more data packets transmitted from the WSN nodes.
[0075] Continuing further, at block 606, the method 600 includes processing, by the processor 106, the received data packets and allocating timeslots to each WSN node using a time-slotted channel hopping (TSCH) mechanism. Each node is selectively directed to transmit, receive, or enter sleep mode during the allocated timeslot. The method 600 further includes adjusting the allocation of the timeslots based on at least one of the following parameters: quantity of the incoming data packets, a detected change in transmission or reception activity of each node, or a recorded packet loss rate during transmission.
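The adjustment criteria of block 606 can be illustrated with a simple heuristic. This is a sketch only: the thresholds, slot cap, and function name are assumptions introduced for illustration, not values from the specification.

```python
def adjust_slots(current_slots, queue_len, loss_rate,
                 max_slots=4, loss_threshold=0.05):
    """Sketch of block 606: grow a node's timeslot allocation when its
    incoming-packet queue backs up or its recorded packet loss rate rises,
    and shrink it when the node has no pending traffic."""
    if loss_rate > loss_threshold or queue_len > 2 * current_slots:
        return min(current_slots + 1, max_slots)   # more capacity needed
    if queue_len == 0 and current_slots > 1:
        return current_slots - 1                   # release an idle slot
    return current_slots                           # no change required

# A node with 5 queued packets on a single slot gets a second slot:
new_slots = adjust_slots(current_slots=1, queue_len=5, loss_rate=0.0)
```

In the proposed method this decision would be refined further by the reinforcement learning mechanism of block 610 rather than fixed thresholds.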
[0076] Continuing further, at block 608, the method 600 includes scheduling, by the processor, the allocated timeslots using a deterministic scheduling technique based on an orchestra algorithm. Each data packet is thereby assigned a precise transmission timeslot, reducing the chances of collision and enabling predictable communication patterns across the network.
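The deterministic property of Orchestra-style scheduling can be shown in a few lines: each node's timeslot is derived from its own identifier, so every node can compute every neighbor's slot without signaling. The slotframe length and the modulo-on-ID rule below follow the general Orchestra approach (sender-based slots derived from a node identifier) but are an illustrative simplification.

```python
SLOTFRAME_LEN = 17   # assumed slotframe length; Orchestra favors primes

def orchestra_slot(node_id, slotframe_len=SLOTFRAME_LEN):
    """Deterministic, Orchestra-style slot assignment: a node's transmission
    timeslot is a pure function of its identifier, so the schedule is
    collision-predictable and needs no negotiation."""
    return node_id % slotframe_len

# Two nodes collide only if their IDs map to the same slot:
slot_a = orchestra_slot(18)   # 18 % 17 -> slot 1
slot_b = orchestra_slot(35)   # 35 % 17 -> slot 1 (same slot, would contend)
```

This determinism is what gives block 608 its predictable communication pattern; the reinforcement learning layer of block 610 then resolves the residual contention such a mapping can produce.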
[0077] Continuing further, at block 610, the method 600 includes executing, by the processor, a reinforcement learning-based mechanism. This mechanism is trained to dynamically update the allocation of timeslots based on predefined parameters and training data. The training data includes both real-time inputs and historical communication records, which help the method 600 learn from past patterns and optimize future decisions.
[0078] Continuing further, at block 612, the method 600 includes assigning, by the processor, a priority level to each data packet. This priority is based on a predefined classification system. Based on the assigned priorities, the method 600 further determines the order in which the packets are transmitted, ensuring that more critical or urgent data is delivered first.
[0079] The method 600 further includes classifying each WSN node as either a high-priority node or a low-priority node, depending on the urgency of the data it generates. Nodes that produce emergency alerts or essential notifications are marked as high-priority. These nodes and their corresponding data are then treated with higher urgency, ensuring faster and more reliable transmission.
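The priority-ordered queueing described in blocks 612 and [0079] (and claimed in claim 4) can be sketched with a double-ended queue: high-priority packets enter at the front, low-priority packets at the rear. The helper name and packet labels below are illustrative assumptions.

```python
from collections import deque

def enqueue(queue, packet, high_priority):
    """Claim-4-style queue management for the TSCH transmit queue:
    high-priority packets (e.g., emergency alerts) are inserted at the
    front end so they are scheduled before low-priority traffic."""
    if high_priority:
        queue.appendleft(packet)
    else:
        queue.append(packet)

q = deque()
enqueue(q, "telemetry", high_priority=False)
enqueue(q, "gas-leak-alert", high_priority=True)   # jumps the queue
enqueue(q, "heartbeat", high_priority=False)
# transmission order: gas-leak-alert, telemetry, heartbeat
```

Packets are then drained from the front of the queue into the allocated timeslots, so the classification of a node as high-priority directly shortens its data's queueing delay.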
[0080] Referring to FIG. 7, exemplary graphical representations of the performance evaluation of prioritized multi-agent reinforcement learning (PMRL) in a static network of 100 nodes, organized into 10 clusters with identical packet intervals, are disclosed. The evaluation focuses on power-sensitive scenarios, where packet loss, power consumption, and packet delivery ratio (PDR) are key performance indicators. FIG. 7A shows that PMRL achieves the lowest packet loss, with only 9 lost packets, which is approximately 20 times fewer than the Default protocol. This improvement results from integrating priority mechanisms with reinforcement learning, effectively reducing collisions. FIG. 7B illustrates that PMRL achieves a high packet delivery ratio of 99.29%, showing a 17.37% increase compared to the Default protocol. This indicates that PMRL not only enhances delivery but also maintains power efficiency. Despite emphasizing packet prioritization, PMRL consumes less power than the Priority protocol, which shows about 5.56% higher power usage due to fixed priority queuing. In contrast, PMRL dynamically learns which packets to prioritize, leading to more efficient energy use, as depicted in FIG. 7C.
[0081] In addition, latency performance is shown in FIG. 8A, where PMRL records the lowest average delay of 0.04 seconds, outperforming other protocols by dynamically adjusting the TSCH schedule to current network conditions. FIG. 8B highlights PMRL’s superior collision management. In TSCH Broadcast, the collision rate is just 0.11%, and in Unicast scenarios, collisions are reduced by 34.54%. RPL Broadcast and Unicast collisions drop significantly by 53.6% and 75.24%, respectively, due to PMRL’s adaptive scheduling and learned transmission strategies. Further, the per-node ETX values for the Default, MRL, Priority, and PMRL protocols, depicted in FIG. 9, confirm PMRL’s transmission efficiency. For example, the ETX value for Node 18 reduces from 265 in the Default protocol to 139 in PMRL, while Node 50’s ETX drops from 267 to 191. These reductions demonstrate PMRL’s ability to minimize retransmissions and improve overall network reliability by combining packet prioritization with machine learning.
[0082] Further, for latency-sensitive networks, the proposed system 102 demonstrates superior performance across all key metrics. As shown in FIG. 10A, OPriority and OPMRL eliminate packet loss entirely, outperforming OPTIMA and OMRL, which each have one lost packet. This is achieved through optimized slotframes and dynamic packet prioritization. FIG. 10B illustrates that OPriority and OPMRL achieve a perfect packet delivery ratio of 100%, aided by reinforcement learning and custom slot scheduling. However, FIG. 10C indicates that this high reliability comes with increased power consumption, where OPriority and OPMRL consume 2.18 mW—higher than OPTIMA and OMRL. Despite the trade-off, the enhanced performance justifies the additional power usage.
[0083] In addition, FIG. 11A shows minimal latency, with OPMRL and OPriority achieving the lowest delays of 0.081 and 0.082 seconds due to prioritized transmission of critical packets. FIG. 11B highlights that OPMRL achieves the lowest collision rates, particularly in TSCH Broadcast (0.04%), and maintains low rates across all transmission types, including unicast and RPL scenarios. Further, FIG. 12 confirms that OPMRL records the lowest ETX values across all nodes, ensuring fewer retransmissions. For example, Node 50’s ETX drops to 155, compared to 291 with OPTIMA. Even in edge nodes like 94 and 107, OPMRL achieves lower ETX values, demonstrating robust and efficient communication under diverse network conditions.
[0084] In an embodiment, FIG. 13 illustrates heat maps that show the trade-offs between power consumption, latency, and PDR. In FIG. 13A, PMRL performs best overall, offering near-optimal PDR (only 0.71% less than OPMRL) while consuming 53.97% less power. FIG. 13B shows that OPTIMA-based protocols like OPMRL achieve high PDR and low latency but at the cost of higher power. FIG. 13C highlights PMRL and OPMRL as top performers, combining high PDR with low latency, with PMRL being more power-efficient.
[0085] In an exemplary implementation for a mobile network, device mobility introduces challenges like signal fluctuation, increased latency, and higher power use. Table V compares protocol performance in a homogeneous mobile network, where all protocols achieve 100% PDR with no packet loss. PMRL and Priority consume the least power at 0.51 mW, while OPriority and OPMRL achieve the lowest latency of 0.046 seconds and minimum collision rates. Their optimized slotframes, priority mechanisms, and multi-agent reinforcement learning enable efficient handling of mobility-related issues.
TABLE V: PERFORMANCE ANALYSIS IN MOBILE ENVIRONMENT
[0086] In an exemplary implementation for a heterogeneous network, as shown in FIG. 4B, nodes differ in type and data transmission intervals. Table VI highlights that OPMRL delivers the highest performance with nearly 99.9% PDR, reducing packet loss from 476 to just 1 and lowering latency from 2.86 s to 0.066 s compared to the Default protocol. It also cuts TSCH and RPL overhead significantly, including an 80% drop in RPL Unicast. Although OPMRL and OPriority consume more power at 1.29 mW, PMRL remains more efficient at 0.55 mW while maintaining a solid 95.83% PDR.
TABLE VI: PERFORMANCE ANALYSIS IN HETEROGENEOUS ENVIRONMENT
[0087] In an exemplary implementation for a critical event-driven network, where nodes like gas sensors transmit only during specific events (e.g., gas leaks), PMRL and OPMRL significantly enhance performance, as shown in Table VII. These protocols prioritize critical transmissions, minimize packet loss, and achieve near-perfect PDR with reduced latency. PMRL improves power efficiency and reliability using reinforcement learning, while OPMRL further optimizes response time and reliability through the OPTIMA scheduler, ensuring effective communication in real-time, time-sensitive scenarios.
TABLE VII: PERFORMANCE ANALYSIS IN CRITICAL EVENT-DRIVEN NETWORK
[0088] In an exemplary embodiment, the proposed system 102 provides a reinforcement learning framework to optimize IoT network performance by addressing both power-sensitive and latency-sensitive applications through adaptive prioritization in TSCH networks. Using DDQN-based learning, the proposed PMRL and OPMRL models outperform traditional schedulers. PMRL improves packet delivery ratio (PDR) by 17.37%, lowers collisions by 34.54%, and reduces power use by 5.56% in static networks. OPMRL achieves 100% PDR in latency-sensitive settings, reduces latency to 0.081 seconds, and lowers collisions significantly. Both models show strong results in mobile and heterogeneous environments, with OPMRL achieving 100% PDR in mobile networks and reducing latency to 0.046 seconds. In heterogeneous setups, it achieves 99.9% PDR and cuts RPL unicast overhead by 80%, confirming adaptability to dynamic large-scale scenarios.
[0089] Thus, the present disclosure provides the system 102 and method 600 for efficient communication management in Internet of Things (IoT) networks using trusted third-party framework, time-slotted channel hopping, and reinforcement learning-based prioritization. The proposed approach enhances packet delivery, reduces latency and power consumption, and ensures adaptive scheduling in dynamic network conditions.
[0090] While the foregoing describes various embodiments of the invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. The scope of the invention is determined by the claims that follow. The invention is not limited to the described embodiments, versions, or examples, which are included to enable those having ordinary skill in the art to make and use the invention when combined with information and knowledge available to those having ordinary skill in the art.
ADVANTAGES OF THE PRESENT DISCLOSURE
[0091] The present disclosure provides a smart, priority-based reinforcement learning framework to enhance scheduling efficiency across varied IoT network scenarios.
[0092] The present disclosure seeks to improve packet delivery ratio (PDR) in large-scale static deployments by employing adaptive scheduling techniques.
[0093] The present disclosure proposes a method to minimize collision rates in densely populated TSCH-based networks.
[0094] The present disclosure aims to lower power consumption in energy-constrained IoT devices through optimized transmission scheduling.
[0095] The present disclosure endeavours to reduce communication latency in time-sensitive applications by dynamically assigning high-priority slots.
[0096] The present disclosure offers reinforcement learning-driven adaptability to ensure consistent data delivery in mobile IoT environments.
[0097] The present disclosure provides a solution for achieving near-instantaneous response and high PDR in real-time, event-triggered IoT systems.
[0098] The present disclosure targets the reduction of both unicast and broadcast collisions in networks handling simultaneous critical and non-critical data flows.
[0099] The present disclosure presents a scheduling strategy that minimizes routing overhead in heterogeneous IoT infrastructures.
[00100] The present disclosure prioritizes high-urgency data transmission to enhance the responsiveness of IoT networks during emergencies.
[00101] The present disclosure incorporates DDQN-based learning for intelligent and context-aware time slot allocation based on current network conditions.
[00102] The present disclosure supports reliable communication in smart environments, including healthcare, industrial, and residential IoT systems.
[00103] The present disclosure focuses on improving quality of service by reducing expected transmission counts and managing congestion.
[00104] The present disclosure enables adaptive scheduling to accommodate fluctuating traffic patterns, dynamic topologies, and mobility scenarios.
[00105] The present disclosure facilitates autonomous orchestration of TSCH time slots through reinforcement learning for efficient and scalable IoT communication.
Claims:
1. A system (102) for management of Internet of Things (IoT) devices, the system comprising:
a processor;
a memory coupled to the processor and storing instructions that, when executed by the processor, cause the system to:
implement a trusted third party (TTP) framework to manage communication among a plurality of wireless sensor network (WSN) nodes;
receive one or more packets transmitted from said plurality of WSN nodes;
process said one or more packets and allocate timeslots to each WSN node using a time-slotted channel hopping (TSCH) mechanism, wherein each node is selectively directed to transmit, receive, or enter a sleep mode during the allocated timeslot;
schedule the allocated timeslots using a deterministic scheduling technique based on an orchestra algorithm, wherein each data packet is assigned to a corresponding timeslot for transmission;
execute a reinforcement learning-based mechanism configured to update allocation of the timeslots based on predefined parameters and training data, the training data comprising real-time and historical collected communication records; and
assign a priority level to each data packet based on a predefined classification, and transmit the one or more packets in a sequence determined by the assigned priority levels.
2. The system (102) as claimed in claim 1, wherein the allocation of the timeslots is adjusted based on at least one of: quantity of the incoming one or more data packets, a detected change in transmission or reception activity of each node, or a recorded packet loss rate during the transmission.
3. The system (102) as claimed in claim 1, wherein the processor is configured to classify each wireless sensor network node as a high-priority node or a low-priority node based on urgency of the one or more data packets generated by each wireless sensor network node, and wherein at least one of the wireless sensor network nodes is designated as the high-priority node when the one or more data packets comprise emergency alerts or essential notifications.
4. The system (102) as claimed in claim 3, wherein the processor is configured to insert the one or more data packets from the high-priority nodes at a front end of a queue associated with the time-slotted channel hopping (TSCH) mechanism and insert the one or more data packets from the low-priority nodes at a rear end of the queue, such that the one or more data packets from the high-priority nodes are scheduled for transmission prior to the one or more data packets from the low-priority nodes.
5. The system (102) as claimed in claim 1, wherein the reinforcement learning-based mechanism predicts communication traffic patterns using the real-time and historical communication records.
6. The system (102) as claimed in claim 1, wherein the reinforcement learning-based mechanism comprises a plurality of agents trained using Double Deep Q-Network (DDQN), each agent configured to select one of a plurality of predefined actions comprising:
maintain a current state of network;
transmit the one or more data packets with priority based on the assigned priority level; and
retransmit the one or more data packets upon failure of a previous transmission attempt,
wherein the plurality of agents utilize a shared memory for replay and learning on distinct datasets to manage operations of the time-slotted channel hopping (TSCH) mechanism within clustered wireless sensor network topologies.
7. The system (102) as claimed in claim 1, wherein the processor is further configured to apply a dynamically adaptive backoff timer associated with the time-slotted channel hopping (TSCH) mechanism, wherein the backoff timer is adjusted based on the priority level of the one or more data packets, such that the one or more data packets with higher priority are assigned a shorter backoff duration relative to the one or more data packets with a lower priority.
8. A method (600) for managing Internet of Things (IoT) devices, comprising:
implementing (602) a trusted third party (TTP) framework to manage communication among a plurality of wireless sensor network (WSN) nodes;
receiving (604), by a processor, one or more data packets transmitted from the plurality of WSN nodes;
processing (606), by the processor, the one or more data packets and allocating timeslots to each WSN node using a time-slotted channel hopping (TSCH) mechanism, wherein each node is selectively directed to transmit, receive, or enter sleep mode during the allocated timeslot;
scheduling (608), by the processor, the allocated timeslots using a deterministic scheduling technique based on an orchestra algorithm, wherein each data packet is assigned to a corresponding timeslot for transmission;
executing (610), by the processor, a reinforcement learning-based mechanism configured to update allocation of the timeslots based on predefined parameters and training data, the training data comprising real-time and historical collected communication records; and
assigning (612), by the processor, a priority level to each data packet based on a predefined classification, and transmitting the one or more data packets in a sequence determined by the assigned priority levels.
9. The method (600) as claimed in claim 8, further comprising adjusting the allocation of the timeslots based on at least one of: the quantity of the incoming one or more data packets, a detected change in transmission or reception activity of each node, or a recorded packet loss rate during transmission.
10. The method (600) as claimed in claim 8, further comprises classifying each wireless sensor network node as a high-priority node or a low-priority node based on urgency of the one or more data packets generated by each wireless sensor network node, and designating at least one of the wireless sensor network nodes as the high-priority node when the one or more data packets comprise emergency alerts or essential notifications.
| # | Name | Date |
|---|---|---|
| 1 | 202541050505-STATEMENT OF UNDERTAKING (FORM 3) [26-05-2025(online)].pdf | 2025-05-26 |
| 2 | 202541050505-REQUEST FOR EXAMINATION (FORM-18) [26-05-2025(online)].pdf | 2025-05-26 |
| 3 | 202541050505-REQUEST FOR EARLY PUBLICATION(FORM-9) [26-05-2025(online)].pdf | 2025-05-26 |
| 4 | 202541050505-FORM-9 [26-05-2025(online)].pdf | 2025-05-26 |
| 5 | 202541050505-FORM FOR SMALL ENTITY(FORM-28) [26-05-2025(online)].pdf | 2025-05-26 |
| 6 | 202541050505-FORM 18 [26-05-2025(online)].pdf | 2025-05-26 |
| 7 | 202541050505-FORM 1 [26-05-2025(online)].pdf | 2025-05-26 |
| 8 | 202541050505-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [26-05-2025(online)].pdf | 2025-05-26 |
| 9 | 202541050505-EVIDENCE FOR REGISTRATION UNDER SSI [26-05-2025(online)].pdf | 2025-05-26 |
| 10 | 202541050505-EDUCATIONAL INSTITUTION(S) [26-05-2025(online)].pdf | 2025-05-26 |
| 11 | 202541050505-DRAWINGS [26-05-2025(online)].pdf | 2025-05-26 |
| 12 | 202541050505-DECLARATION OF INVENTORSHIP (FORM 5) [26-05-2025(online)].pdf | 2025-05-26 |
| 13 | 202541050505-COMPLETE SPECIFICATION [26-05-2025(online)].pdf | 2025-05-26 |
| 14 | 202541050505-Proof of Right [20-08-2025(online)].pdf | 2025-08-20 |
| 15 | 202541050505-FORM-26 [20-08-2025(online)].pdf | 2025-08-20 |