Method And System For Estimation Of Microbial Community Structure

Abstract: The present disclosure relates to a method and system for facilitating estimation of microbial community structure. The method may be performed by a set of instructions executed by a processor and may include the steps of: receiving a first dataset pertaining to a plurality of microbial species to be assembled; obtaining a second dataset pertaining to monoculture microbial species abundance; obtaining a third dataset pertaining to leave-one-out microbial species; executing a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed dataset pertaining to interactions of a first microbial species mapped with a second microbial species among the plurality of species; executing a predictive analysis on the transformed dataset based on an executed population dynamics model; and determining steady-state abundances of the plurality of species to form the microbial community structure.

Patent Information

Application #

Filing Date

27 March 2021

Publication Number

39/2022

Publication Type

INA

Invention Field

BIO-CHEMISTRY

Status

info@khuranaandkhurana.com

Parent Application

Applicants

Indian Institute of Science

C V Raman Road, Bangalore - 560012, Karnataka, India.

Inventors

1. DIXIT, Narendra Madhukar

Department of Chemical Engineering, Indian Institute of Science, Bangalore - 560012, Karnataka, India.

2. ANSARI, Aamir Faisal

Department of Chemical Engineering, Indian Institute of Science, Bangalore - 560012, Karnataka, India.

Specification

Claims:
1. Method for estimation of microbial community structure, said method comprising:
receiving, by a processor, a first dataset, said dataset pertaining to a plurality of monoculture microbial species to be interactively assembled, wherein the number of monoculture microbial species in the plurality of monoculture microbial species is represented by n;
obtaining, by the processor, a second dataset, said second dataset pertaining to abundance of each of the plurality of monoculture microbial species;
obtaining, by the processor, a third dataset, said third dataset pertaining to leave-at-least-one-out microbial species;
executing, by the processor, a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed data set, wherein the transformed dataset pertains to interactions of at least a first microbial species mapped with at least a second microbial species among the plurality of monoculture microbial species;
executing, by the processor, a pre-determined population dynamics model to estimate the microbial community structure, wherein upon execution of the pre-determined population dynamics model, the number of steps required in experimental analysis of said microbial community structure reduces from 2^n to 2n.
2. The method as claimed in claim 1, wherein each of the first dataset, second dataset, third dataset and the transformed data set comprises of at least one of data tables, data sheets, and data matrices, and wherein each of the data tables, the data sheets, and the data matrices has a plurality of attributes including at least one of a row, a column, and a list.
3. The method as claimed in claim 1, wherein each species of the plurality of monoculture microbial species of the first dataset is tagged with a unique predefined code.
4. The method as claimed in claim 1, wherein the first set of instructions is configured to map the species interactions based on the tagged first data set, the second data set and the third dataset, wherein any or a combination of a pairwise and a higher order species interactions generate the transformed dataset, wherein the pairwise interactions comprises at least two species interactions and wherein the higher order species interactions comprise three or more species interactions.
5. The method as claimed in claim 1, wherein the third dataset pertaining to leave-one-out species is obtained by leaving out at least one species at a time from a microbial community to determine impact of said one species on the structure of the microbial community.
6. The method as claimed in claim 1, wherein the mapping of the microbial species interactions is based on effective pairwise interaction, wherein the effective pairwise interaction corresponds to net effect of at least the first microbial species on growth rate of at least the second microbial species through the pairwise and the higher-order interactions.
7. A system facilitating estimation of microbial community structure, said system comprising a processor that executes a set of executable instructions that are stored in a memory, upon execution of which, the processor causes the system to:
receive a first dataset, said dataset pertaining to a plurality of monoculture microbial species to be interactively assembled, wherein the number of monoculture microbial species in the plurality of monoculture microbial species is represented by n;
obtain a second dataset, said second dataset pertaining to monoculture microbial species abundance;
obtain a third dataset, said third dataset pertaining to leave-at-least-one-out microbial species;
execute a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed data set, wherein the transformed dataset pertains to interactions of at least a first microbial species mapped with at least a second microbial species among the plurality of species; and
execute a pre-determined population dynamics model to estimate the microbial community structure, wherein upon execution of the pre-determined population dynamics model, the number of steps required in experimental analysis of said microbial community structure reduces from 2^n to 2n.
8. The system as claimed in claim 7, wherein each species of the plurality of species of the first dataset is tagged with a predefined code, wherein the first set of instructions is configured to map the species interactions based on the tagged first data set, the second data set and the third dataset, wherein any or a combination of pairwise and higher order species interactions generate the transformed dataset.
9. The system as claimed in claim 7, wherein the third dataset pertaining to leave-at-least-one-out species is obtained by leaving at least one species at a time out from a microbial community to determine impact of said one species on the microbial community.
10. The system as claimed in claim 7, wherein the mapping of the species interactions is based on effective pairwise interaction, wherein the effective pairwise interaction corresponds to net effect of at least the first species on growth rate of at least the second species through the pairwise and the higher-order interactions.
Description:
TECHNICAL FIELD
[1] The present disclosure relates to a method for design and performance of experiments for estimating microbial population dynamics with species interactions. More importantly, the present disclosure relates to a method and a system for estimating microbial community structure through species interactions.

BACKGROUND
[2] Background description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
[3] Multispecies communities of microorganisms, or microbiomes, inhabit diverse natural ecosystems, such as soil, forests and oceans, and form critical components of the ecosystems. The soil microbiome, for instance, is involved in the biogeochemical cycling of nutrients and is important to plant and animal life, especially under changing climatic conditions. Microbiomes also occupy niches on the human body and influence human health. Alterations in the gut microbiome, for instance, can adversely affect our digestion, immunity, and mental well-being. Consequently, enormous efforts are ongoing to unravel the principles governing the sustenance and functioning of microbiomes and to devise strategies to prevent their disruption and/or engineer their restoration. Transplantation of fecal microbiota and the administration of prebiotics and probiotics are examples of advanced strategies under investigation for the restoration of gut health. Such efforts have revealed the many advantages that communities have over isolated individual species and triggered a keen interest in designing synthetic microbial communities for a variety of applications in biotechnology and healthcare.
[4] Approaches to engineering microbiomes hinge on the ability to predict or estimate the structures of communities formed by assembling chosen microbial species. This ability is currently limited and poses a key challenge in microbial ecology and biotechnology. Specifically, it is difficult to predict whether communities will remain stable in a given nutrient environment, and, if they do, what their structures, in terms of the relative abundances of the different species involved, will be. The difficulty arises because of the complex nature of the interactions between the species involved, which in turn determine the structures of the communities. Recent studies have demonstrated the existence of higher-order interactions (i.e., interactions involving more than two species) underlying community structures. Such interactions are particularly challenging to unravel. Several methods to unravel underlying interactions and predict community structures have been developed. Genome-scale metabolic reconstructions have been used together with mathematical models to infer metabolic interactions between species. Network-inference algorithms and, more recently, machine learning-based approaches have been used to deduce correlations among species abundances, as indicators of underlying interactions. Synthetic microbial communities have been created to mimic their natural counterparts and analyzed, often using mathematical models, to infer underlying interactions. While curated genome-scale models are only rarely available, network-inference and machine learning algorithms do not typically support inferences of causality. Analysis of synthetic microbial communities is thus the more widely used approach. The major challenge, however, is the prohibitively large number of experiments that must be performed. In an n-species community, given the possibility of high-order interactions, one must estimate all pairwise, ternary, quaternary, etc. interactions up to the n-th order interaction.
These interactions constitute the ‘interaction map’ of the community. Interactions are typically estimated by comparing the abundances of species in relevant sub-communities. By comparing the abundance of species 1, say, in monoculture versus in coculture with species 2, the influence of species 2 on species 1 and vice versa can be estimated. Extending this argument to higher-order interactions, it follows that with this ‘bottom-up’ approach, the number of species combinations, or sub-communities, that must be examined to infer the interaction map of an n-species microbial community is approximately 2^n. The number of species combinations thus increases exponentially with n. For a 10-species community (n = 10), for instance, the number of combinations is ~2^10, indicating the need for an experimental design and effort of over 1000 different sub-communities. Such ‘full-factorial’ experiments have been performed occasionally, as with the 5-member community representing the gut microbiome of Drosophila melanogaster. For typical natural communities, comprising hundreds of species, and reasonably sized synthetic communities, this combinatorial requirement will far exceed realistic experimental capabilities. Engineering microbial communities thus tends to rely on heuristic and phenomenological approaches.
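The scaling argument in paragraph [4] can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that a full-factorial design examines every non-empty subset of the n species (2^n - 1 sub-communities), and it contrasts that count with a design made of n monocultures plus n leave-one-out communities (2n). The function names are hypothetical and do not come from the disclosure.

```python
def full_factorial_subcommunities(n: int) -> int:
    """Number of non-empty species subsets of an n-species community
    (assumed size of a full-factorial, bottom-up design)."""
    return 2 ** n - 1


def linear_design_subcommunities(n: int) -> int:
    """n monocultures plus n leave-one-out communities."""
    return 2 * n


# Compare the two experimental designs for a few community sizes.
for n in (5, 10, 20):
    print(n, full_factorial_subcommunities(n), linear_design_subcommunities(n))
```

For n = 10 the first count exceeds 1000, in line with the figure quoted above, while the second is 20.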
[5] There is therefore a need in the art to provide a method and a system that can estimate the structure of a community of a plurality of species by examining a number of sub-communities that is only a linear multiple of the number of species.

OBJECTS OF THE PRESENT DISCLOSURE
[6] Some of the objects of the present disclosure, which at least one embodiment herein satisfies, are listed herein below.
[7] It is an object of the present disclosure to provide a method that accurately predicts community structure.
[8] It is an object of the present disclosure to provide a method that is robust to parameter and model structure variations.
[9] It is an object of the present disclosure to provide a method that facilitates subsuming high-order interactions between species into effective pairwise interactions using minimal experiments.
[10] It is an object of the present disclosure to provide a system and method in which experimental effort scales linearly, rather than exponentially, with the number of species.
[11] It is an object of the present disclosure to provide a system and method that facilitates accurate estimations for a large number of species in the community, making it a promising approach for describing large, natural microbial communities.
[12] It is an object of the present disclosure to provide a method that facilitates understanding and predicting the structures of large communities using minimal, potentially tractable experiments.

SUMMARY
[13] This section is provided to introduce certain objects and aspects of the present invention in a simplified form that are further described below in the detailed description. This summary is not intended to identify the key features or the scope of the claimed subject matter.
[14] In order to achieve the aforementioned objectives, the present invention provides a method and system for estimation of microbial community structure. In an aspect, the method may include the steps of: receiving, by a processor, a first dataset, the first dataset pertaining to a plurality of monoculture microbial species to be interactively assembled, where the number of monoculture microbial species in the plurality of monoculture microbial species may be represented by n; obtaining, by the processor, a second dataset, the second dataset pertaining to monoculture microbial species abundance; obtaining, by the processor, a third dataset, the third dataset pertaining to leave-at-least-one-out microbial species; and executing, by the processor, a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed dataset. The transformed dataset may pertain to interactions of at least a first microbial species mapped with at least a second microbial species among the plurality of species. The method may further include the step of executing, by the processor, a pre-determined population dynamics model to estimate the microbial community structure, and upon execution of the pre-determined population dynamics model, the number of steps required in experimental analysis of the microbial community structure may reduce from 2^n to 2n.
[15] In an embodiment, each of the first dataset, second dataset, third dataset and the transformed data set may include at least one of data tables, data sheets, and data matrices. Each of the data tables, the data sheets, and the data matrices may have a plurality of attributes including at least one of a row, a column, and a list.
[16] In an embodiment, each species of the plurality of species of the first dataset may be tagged with a predefined code.
[17] In an embodiment, the first set of instructions may be configured to map the species interactions based on the tagged first data set, the second data set and the third dataset to generate the transformed dataset. The species interactions may be any or a combination of a pairwise and a higher order species interactions. The pairwise interactions may include at least two species interactions and the higher order species interactions may include three or more species interactions.
[18] In an embodiment, the third dataset pertaining to leave at least one out species may be obtained by leaving at least one species at a time out from a microbial community to determine impact of one species on the microbial community.
[19] In another embodiment, the mapping of the microbial species interactions may be based on effective pairwise interaction. The effective pairwise interaction may correspond to net effect of at least the first microbial species on growth rate of at least the second microbial species through the pairwise and the higher-order interactions.
[20] In another aspect, the present disclosure may provide for a system facilitating estimation of microbial community structure. The system may include a processor that executes a set of executable instructions that may be stored in a memory, upon execution of which, the processor may cause the system to: receive a first dataset, said first dataset pertaining to a plurality of monoculture microbial species to be interactively assembled, where the number of monoculture microbial species in the plurality of monoculture microbial species may be represented by n; obtain a second dataset, the second dataset pertaining to monoculture microbial species abundance; obtain a third dataset, the third dataset pertaining to leave-at-least-one-out microbial species; and execute a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed dataset, where the transformed dataset may pertain to interactions of at least a first microbial species mapped with at least a second microbial species among the plurality of species. The processor may further cause the system to execute a pre-determined population dynamics model to estimate the microbial community structure, and upon execution of the pre-determined population dynamics model, the number of steps required in experimental analysis of said microbial community structure may reduce from 2^n to 2n.
[21] In an embodiment, the system may be configured to tag each species of the plurality of species of the first dataset with a predefined code, wherein the first set of instructions may be configured to map the species interactions based on the tagged first data set, the second data set and the third dataset to generate the transformed dataset. The species interactions may be any or a combination of pairwise and higher-order species interactions. The pairwise interactions may include at least two species interactions and the higher-order species interactions may include three or more species interactions.
[22] In another embodiment, the system may be configured to obtain the third dataset pertaining to leave-at-least-one-out species by leaving at least one species at a time out from a microbial community to determine impact of said one species on the microbial community.
[23] In an embodiment, the system may be configured to map the species interactions based on effective pairwise interaction. The effective pairwise interaction may correspond to net effect of at least the first species on growth rate of at least the second species through the pairwise and the higher-order interactions.

BRIEF DESCRIPTION OF THE DRAWINGS
[24] In the figures, similar components and/or features may have the same reference label. Further, various components of the same type may be distinguished by following the reference label with a second label that distinguishes among the similar components. If only the first reference label is used in the specification, the description is applicable to any one of the similar components having the same first reference label irrespective of the second reference label.
[25] FIG. 1 illustrates exemplary network architecture in which or with which proposed system can be implemented in accordance with an embodiment of the present disclosure.
[26] FIG. 2 illustrates an exemplary architecture of a processor in which or with which proposed system can be implemented in accordance with an embodiment of the present disclosure.
[27] FIG. 3 illustrates an exemplary representation of a flow diagram for predicting microbial community structure in accordance with an embodiment of the present disclosure.
[28] FIGs. 4A-4B illustrate generic representations of a flow diagram of the proposed method in accordance with an embodiment of the present disclosure.
[29] FIGs. 5A-5C illustrate exemplary implementation results of the proposed method (EPICS) in accordance with an embodiment of the present disclosure.
[30] FIGs. 6A-6H illustrate generic representations of an in silico 5-member microbial community demonstration in accordance with embodiments of the present disclosure.
[31] FIGs. 7A-7K illustrate generic representations of how the accuracy of EPICS increases with the number of species and how EPICS performed with different interaction strengths, in accordance with embodiments of the present disclosure.
[32] FIGs. 8A-8K illustrate generic representations of robustness of EPICS in accordance with embodiments of the present disclosure.
[33] FIGs. 9A-9K illustrate generic representations of EPICS being robust to variations in the distributions of the random interaction elements in accordance with embodiments of the present disclosure.
[34] FIGs. 10A-10K illustrate generic representations of EPICS being robust to redundant high-order interactions in accordance with embodiments of the present disclosure.
[35] FIGs. 11A-11K illustrate generic representations of EPICS being robust to the selection of the distributions of the random interaction coefficient tensors in accordance with embodiments of the present disclosure.
[36] FIGs. 12A-12C illustrate how EPICS captured abundances in a 5-member representative gut microbial community of Drosophila melanogaster in accordance with an embodiment of the present disclosure.
[37] FIG. 12D illustrates exemplary representations of EPICS outperforming mean-field and pairwise-only models in accordance with embodiments of the present disclosure.
[38] FIGs. 13A-13C illustrate effective pairwise interactions calculated using EPICS, pairwise-only, and true interactions, in accordance with embodiments of the present disclosure.
[39] FIG. 13D illustrates the breakup of interactions by order of interaction in accordance with embodiments of the present disclosure.
[40] FIGs. 14A-14D illustrate exemplary representations of EPICS outperforming pairwise interactions in predicting structures of subcommunities in accordance with embodiments of the present disclosure.
[41] FIG. 15 illustrates exemplary representations of calculating effective pairwise interactions using true interactions in accordance with embodiments of the present disclosure.

DETAILED DESCRIPTION
[42] In the following description, numerous specific details are set forth in order to provide a thorough understanding of embodiments of the present invention. It will be apparent to one skilled in the art that embodiments of the present invention may be practiced without some of these specific details.
[43] The present disclosure relates to a method for species interaction. More importantly, the present disclosure relates to a method and a system for estimating microbial community structure through species interaction.
[44] Embodiments of the present invention may be provided as a computer program product, which may include a machine-readable storage medium tangibly embodying thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process. The machine-readable medium may include, but is not limited to, fixed (hard) drives, magnetic tape, floppy diskettes, optical disks, compact disc read-only memories (CD-ROMs), and magneto-optical disks, semiconductor memories, such as ROMs, random access memories (RAMs), programmable read-only memories (PROMs), erasable PROMs (EPROMs), electrically erasable PROMs (EEPROMs), flash memory, magnetic or optical cards, or other type of media/machine-readable medium suitable for storing electronic instructions (e.g., computer programming code, such as software or firmware).
[45] Various methods described herein may be practiced by combining one or more machine-readable storage media containing the code according to the present invention with appropriate standard computer hardware to execute the code contained therein. An apparatus for practicing various embodiments of the present invention may involve one or more computers (or one or more processors within a single computer) and storage systems containing or having network access to computer program(s) coded in accordance with various methods described herein, and the method steps of the invention could be accomplished by modules, routines, subroutines, or subparts of a computer program product.
[46] Exemplary embodiments will now be described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments are shown. These exemplary embodiments are provided only for illustrative purposes and so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those of ordinary skill in the art. The invention disclosed may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Various modifications will be readily apparent to persons skilled in the art. Thus, the present invention is to be accorded the widest scope encompassing numerous alternatives, modifications and equivalents consistent with the principles and features disclosed.
[47] The present invention provides a solution to the above-mentioned problem in the art by providing a method and a system that can predict the structure of a community of a plurality of species by examining a number of sub-communities that is just a linear multiple of the number of species. The method can use steady-state abundances of species in monocultures and leave-one-out sub-communities to predict the species community. The method can further map effective pairwise interactions, which represent interactions between species pairs and subsume high-order interactions, to predict the species community. FIG. 1 illustrates an exemplary network architecture (100) in which or with which the system (102) of the present disclosure can be implemented, in accordance with an embodiment of the present disclosure. As illustrated in FIG. 1, according to an aspect of the present disclosure, a microbial community structure estimation system (also referred to as the system 100, hereinafter) can provide estimation of the microbial community structure corresponding to a first dataset that may pertain to a plurality of monoculture microbial species to be interactively assembled, where the number of monoculture microbial species in the plurality of monoculture microbial species is represented by n.
[48] The microbial species may include unicellular and multicellular organisms such as, but not limited to, bacteria, viruses, fungi, archaea and the like.
[49] In an embodiment, the system 100 can include a modelling unit 102, one or more input devices, one or more output devices, one or more power devices, and a network 104 operatively coupled to the modelling unit 102. In an exemplary embodiment, the one or more input devices can include a keyboard, keypad 110, touchpad, and the like. The keypad 110 can be configured to acquire one or more attributes of the plurality of species and other state parameters associated with the microbial community. The one or more output devices can include a display unit 112. The display unit 112 can be used to provide a visual model to the user.
[50] In an embodiment, the input devices 110 and display units 112 can be associated with one or more computing devices 106. In an embodiment, the system 100 can be implemented using any or a combination of hardware components and software components such as a cloud, a server, a computing device, a network device, and the like. The modelling unit 102 may be configured to receive the first dataset from the one or more computing devices 106, which can in turn receive the first dataset via network 104 from a centralised server 108.
[51] In another embodiment, the computing device 106 can include a second dataset corresponding to monoculture of the plurality of species and a third dataset corresponding to leave one out sub-communities of the species. The second dataset and the third dataset can be stored in the memory of the computing device 106. In yet another embodiment, the dataset associated with can be user defined.
[52] In an implementation, the system 100 can be accessed by the one or more computing devices 106 through a website or application that can be configured with any operating system, including but not limited to, Android™, iOS™, KaiOS™ and the like.
[53] Further, the system 100 can also be configured to execute a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed data set. The transformed dataset may pertain to interactions of at least a first species mapped with at least a second species among the plurality of species. The system 100 can also be configured to execute a predictive analysis on the transformed data set, where the execution may be done through a pre-determined population dynamics model to estimate the microbial community structure. Upon execution of the pre-determined population dynamics model, the number of steps required in experimental analysis of the microbial community structure may reduce from 2^n to 2n. The first set of instructions may pertain to steps of algorithms such as, but not limited to, the Levenberg-Marquardt algorithm, to obtain the estimation of microbial community structure through species interactions.
[54] In an embodiment, the system may be configured to tag each species of the plurality of species of the first dataset with a predefined code by the first set of instructions. In another embodiment, the first set of instructions may be configured to map the species interactions based on the tagged first data set, the second data set and the third dataset. In yet another embodiment, the species interactions can generate the transformed dataset. The species interactions may be any or a combination of a pairwise and a higher order species interactions. The pairwise interactions may include at least two species interactions and the higher order species interactions may include three or more species interactions.
[55] In an embodiment, the third dataset pertaining to leave-one-out species can be obtained by leaving out one species at a time from a microbial community to determine impact of the one species on the microbial community.
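The experimental design implied by the second and third datasets can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes each species carries a unique code (here hypothetical strings such as "S1") and lists the n monocultures and the n leave-one-out sub-communities. The function name and the return layout are assumptions made for the example.

```python
def design_subcommunities(species_codes):
    """Return the monocultures and leave-one-out sub-communities
    for a community whose species are identified by unique codes."""
    codes = list(species_codes)
    if len(set(codes)) != len(codes):
        raise ValueError("species codes must be unique")
    # One monoculture per species.
    monocultures = [(code,) for code in codes]
    # One sub-community per species left out, keyed by the omitted species.
    leave_one_out = {
        left_out: tuple(code for code in codes if code != left_out)
        for left_out in codes
    }
    return monocultures, leave_one_out


monocultures, leave_one_out = design_subcommunities(["S1", "S2", "S3", "S4", "S5"])
print(len(monocultures) + len(leave_one_out))  # total number of cultures to grow
```

Keying each leave-one-out community by the omitted species keeps the link between a measurement and the species whose impact it is meant to reveal.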
[56] In another embodiment, the mapping of the species interactions may be based on effective pairwise interaction. The effective pairwise interaction may correspond to net effect of at least the first species on growth rate of at least the second species through pairwise and higher-order interactions.
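One way to make the idea of effective pairwise interactions concrete is sketched below. It is a simplified illustration under stated assumptions, not the procedure of the disclosure: it assumes a scaled Lotka-Volterra form in which the per-capita growth of species i is proportional to 1 + sum_j b_ij x_j, so that each steady state yields linear equations in the coefficients b_ij. Monoculture abundances fix the diagonal terms, and the leave-one-out abundances fix the off-diagonal terms. All names are hypothetical, at least three species are needed, and a fitting routine such as Levenberg-Marquardt (mentioned in the disclosure) is replaced here by a direct linear solve.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def estimate_effective_interactions(mono, loo):
    """mono[i]: steady-state abundance of species i grown alone.
    loo[k][j]: steady-state abundance of species j when species k is left out
    (the entry loo[k][k] is ignored)."""
    n = len(mono)
    if n < 3:
        raise ValueError("at least 3 species are needed")
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Monoculture steady state: 1 + b_ii * mono[i] = 0.
        B[i][i] = -1.0 / mono[i]
        others = [j for j in range(n) if j != i]
        # One equation per leave-k-out community that still contains species i.
        rows = [[0.0 if j == k else loo[k][j] for j in others] for k in others]
        rhs = [-1.0 - B[i][i] * loo[k][i] for k in others]
        for j, b_ij in zip(others, solve_linear(rows, rhs)):
            B[i][j] = b_ij
    return B


def predict_community(B):
    """Coexistence steady state of the full community: B x = -1."""
    return solve_linear(B, [-1.0] * len(B))
```

Under these assumptions, 2n measured cultures supply exactly the n^2 coefficients needed, and `predict_community` then returns the predicted steady-state abundances of the full community. In practice one would normally use a library routine such as `numpy.linalg.solve` instead of the hand-written solver, which is included only to keep the sketch dependency-free.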
[57] Examples of the computing devices can include, but are not limited to, a computing device 106, a smart phone, a portable computer, a laptop, a handheld device, a workstation and the like.
[58] In an embodiment, the system 100 can be communicatively coupled to the modelling unit 102 through a communication unit, wherein the communication unit is a network 104 that can include any or a combination of a wireless network module, a wired network module, a dedicated network module and a shared network module.
[59] FIG. 2 illustrates an exemplary architecture of a processor 202 coupled to the system (100) in accordance with an embodiment of the present disclosure.
[60] As illustrated, the modelling unit 102 can include one or more processor(s) 202. The one or more processor(s) 202 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that manipulate data based on operational instructions. Among other capabilities, the one or more processor(s) 202 are configured to fetch and execute computer-readable instructions stored in a memory 204 of the modelling unit 102. The memory 204 can store one or more computer-readable instructions or routines, which may be fetched and executed to create or share the data units over a network service. The memory 204 can include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as EPROM, flash memory, and the like.
[61] The modelling unit 102 can also include an interface(s) 206. The interface(s) 206 may include a variety of interfaces, for example, interfaces for data input and output devices, referred to as I/O devices, storage devices, and the like. The interface(s) 206 can facilitate communication of the modelling unit 102 with various devices coupled to the modelling unit 102. The interface(s) 206 can also provide a communication pathway for one or more components of the modelling unit 102. Examples of such components include, but are not limited to, processing units 208 and database 210. In another exemplary embodiment, the datasets pertaining to a microbial species community can be stored in the database 210.
[62] Further, the processing units 208 can be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processing units 208. The database 210 can include data that is either stored or generated because of functionalities implemented by any of the components of the processing units 208.
[63] In an example, the processing units 208 can include a data acquisition engine 212, a population dynamics modelling engine 214, and other unit(s) 216. The other unit(s) 216 can implement functionalities that supplement applications or functions performed by the modelling unit 102 or the processing units 208.
[64] In an embodiment, the processor may configure the data acquisition engine 212 to ingest a first dataset pertaining to a plurality of species to be assembled, obtain a second dataset pertaining to monoculture species abundance, and obtain a third dataset pertaining to leave-one-out species. In an exemplary embodiment, the leave-one-out dataset can be obtained by leaving one species at a time out of a microbial community to determine the impact of that species on the microbial community.
[65] Further, the processor may configure a population dynamics modelling engine 214 to execute a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed dataset. In an embodiment, the transformed dataset may pertain to interactions of at least a first species mapped with at least a second species among the plurality of species. The population dynamics modelling engine 214 can then execute a predictive analysis on the transformed dataset through a predefined second set of instructions stored in a database. The population dynamics model used to estimate the effective pairwise interactions may include, but is not limited to, the Generalized Lotka-Volterra (GLV) model. Based on the executed population dynamics model, the population dynamics modelling engine 214 can determine steady-state abundances of the plurality of species, where the steady-state abundances form the microbial community structure.
[66] FIG. 3 illustrates an exemplary method flow diagram (300) depicting a method for estimation of microbial community structure in accordance with an embodiment of the present disclosure.
[67] At step 302, the method 300 (also referred to as Effective Pairwise Interactions for predicting Community Structures 300 or EPICS 300) includes the step of receiving, by a processor, a first dataset pertaining to a plurality of species to be assembled.
[68] Further, at step 304, the method includes the step of obtaining, by the processor, a second dataset pertaining to monoculture species abundance and at step 306, the method includes the step of obtaining, by the processor, a third dataset, the third dataset pertaining to leave one out species.
[69] Furthermore, at step 308, the method includes the step of executing, by the processor, a first set of instructions on the first dataset, the second dataset and the third dataset to obtain a transformed dataset. The transformed dataset may pertain to interactions of at least a first species mapped with at least a second species among the plurality of species. At step 310, the method includes the step of executing, by the processor, a pre-determined population dynamics model to estimate the microbial community structure. Upon execution of the pre-determined population dynamics model, the number of steps required in experimental analysis of the microbial community structure may reduce from that of the full-combinatorial set to 2n. The system and method of the present disclosure may be further described in view of exemplary embodiments.
[70] FIGs. 4A-4B illustrate generic representations of flow diagrams of the proposed method in accordance with an embodiment of the present disclosure.
[71] At step 402, the method includes the step of inputting the number of species to be assembled ('n'), where 'n' should be greater than 2. At step 404, the method includes the step of assigning unique number codes to the species and retaining those codes for inputting the abundances in the following steps. In an exemplary embodiment, the unique number code for a species should be a whole number ranging at least from '1' to 'n'.
[72] Further at step 406, the method may include the step of measuring and inputting the monoculture steady-state abundances. In an exemplary embodiment, the input should be a table with a row and 'n' columns but not limited to it, where ith column may contain the monoculture abundance of the species coded as 'i'. At step 408, the method may include the step of measuring and inputting the steady-state abundances of species in leave-one-out cultures. In an exemplary embodiment, the input can be a table with 'n' rows and 'n' columns. The ith row corresponds to the leave-one-out culture where the ith species is left out. The columns may indicate the abundances of the species in each experiment. The ith column may be for the species coded as 'i'. In yet another exemplary embodiment, the species absent in a coculture should have zero as the entry in its position in the table. If some other species also go extinct at the steady-state, their abundances should also be entered as zeros in the table.
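The input layout described for steps 402 to 408 can be checked before any estimation is attempted. The following is a minimal sketch, assuming the tables are supplied as NumPy-compatible arrays; the function name `validate_inputs`, the 0-based indexing, and the requirement that monoculture abundances be strictly positive are illustrative assumptions and not part of the disclosure.

```python
import numpy as np

def validate_inputs(mono, loo):
    """Validate the monoculture row and the leave-one-out table.

    mono: 1 x n (or length-n) monoculture steady-state abundances.
    loo:  n x n table; row i is the culture with species i left out and
          column j holds the abundance of species j (0-based here,
          whereas the text codes species from 1 to n).
    """
    mono = np.asarray(mono, dtype=float).ravel()
    loo = np.asarray(loo, dtype=float)
    n = mono.size
    if n <= 2:
        raise ValueError("n should be greater than 2")
    if loo.shape != (n, n):
        raise ValueError("leave-one-out table must have n rows and n columns")
    if np.any(mono <= 0):
        # Assumption of this sketch: every species grows in monoculture.
        raise ValueError("monoculture abundances must be positive")
    if np.any(loo < 0):
        raise ValueError("abundances cannot be negative")
    if np.any(np.diag(loo) != 0):
        raise ValueError("the left-out species must be entered as zero")
    return n, mono, loo
```

Species that go extinct in a leave-one-out culture remain valid input here, since the text asks for them to be entered as zeros.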
[73] Furthermore, the method may include the step 410 of inputting a population dynamics model that species follow. In an exemplary embodiment, a default model can be Generalized Lotka-Volterra model but not limited to it.
[74] Further, the method may include the step 412 of effective interaction mapping of the community and the steady-state species abundances in the community. In an exemplary embodiment, the effective interaction map may be produced in the form of a matrix of 'n' rows and 'n' columns but not limited to it. The entry in the ith row and the jth column may be the effective pairwise effect of the species coded as 'j' on the species coded as 'i'. In another exemplary embodiment, the steady-state abundances are output as a table containing a row and 'n' columns. The entry in the ith column is the predicted steady-state abundance of the species coded as ‘i’.
[75] FIG. 4B shows a flowchart describing the steps of the proposed method. At step 422, the number of species in the community can be chosen, and at step 424 the population dynamics model can be decided. At step 426, the interaction tensors may be set up, and at step 428 the steady-state abundances of all the species in the community may be calculated. At step 430, whether the abundances are positive (feasible) and locally stable may be checked. At step 432, if they are feasible and stable, the steady-state abundances in all leave-one-out communities may be calculated, and at step 436 whether all of them are feasible and locally stable may be checked. If they are, at step 438, the leave-one-out steady-state abundances may be used to obtain effective pairwise interactions using EPICS. At step 440, the effective pairwise interactions may be used to predict the abundances in the original community, and at step 442 the predicted abundances may be compared with the actual abundances calculated earlier using the slope, intercept, R-squared (RSQ), and RMS of relative error. Interaction matrices contain elements that are generated from random distributions; thus, for each set of interaction strengths, the aforementioned steps may be performed at least 300 times, but not limited to it. The method may then be repeated for different numbers of species 'n'.
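The feasibility and local-stability check in the flowchart can be sketched as follows. This is a simplified illustration that assumes a GLV community with pairwise interactions only, so that the steady state solves a linear system; the function names are hypothetical, and the disclosure's own check also covers high-order interaction tensors, which this sketch omits.

```python
import numpy as np

def steady_state(A, r):
    """Solve r + A x = 0 for the coexistence steady state."""
    return np.linalg.solve(np.asarray(A, dtype=float), -np.asarray(r, dtype=float))

def is_feasible_and_stable(A, r):
    """Return True if all abundances are positive and the state is locally stable."""
    A = np.asarray(A, dtype=float)
    x = steady_state(A, r)
    feasible = bool(np.all(x > 0))
    # Jacobian of dx_i/dt = x_i (r_i + sum_j A_ij x_j) at the steady state,
    # where the bracketed term vanishes: J = diag(x) A.
    J = np.diag(x) @ A
    stable = bool(np.max(np.linalg.eigvals(J).real) < 0)
    return feasible and stable
```

A realization would be kept only if this check passes for the full community and for every leave-one-out sub-community.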
[76] FIGs. 5A-5C illustrate exemplary implementation results of the proposed method in accordance with an embodiment of the present disclosure.
[77] FIG. 5A illustrates a schematic of the proposed method EPICS in accordance with an embodiment of the present disclosure. FIG. 5A illustrates an interaction map of a hypothetical microbial community of at least four coded species. Each species may grow on the nutrients available and inhibit itself at high abundances. Pairwise interactions may be shown as 502 and high-order interactions as 504. Arrows may denote positive interactions and blunted ends negative interactions. FIG. 5B illustrates the species combinations (monocultures and leave-one-out cultures) that help estimate effective pairwise interactions that subsume high-order interactions, based on an applicable population dynamics model. The species present in each culture are indicated by filled circles. FIG. 5C illustrates the resulting effective pairwise interaction map, which is applied to predict the community structure using the population dynamics model. The concept may be further illustrated with reference to FIG. 5A. A community of $n$ microbial species that obeys a known population dynamics model may be chosen. The time-evolution of the population size (or abundance) $x_i$ of species $i$ in the community can then be written as

$$\frac{dx_i}{dt} = x_i\, f_i(\mathbf{x}), \qquad i = 1, \dots, n \qquad (1)$$

[78] where $f_i$ represents the per-capita growth rate of species $i$, which depends on the abundances of all the species, collated in the column vector $\mathbf{x}$. The choice of $f_i$ may be determined by the population dynamics model, which in turn may depend on the nature of the underlying interactions. The proposed method (EPICS) may be depicted for the widely used GLV model (refs. 30-33). The method may work for other models too. In the GLV model with high-order interactions, the growth rate can be written as

$$f_i(\mathbf{x}) = r_i + \sum_{j} a_{ij} x_j + \sum_{j,k} b_{ijk} x_j x_k + \sum_{j,k,l} c_{ijkl} x_j x_k x_l + \cdots \qquad (2)$$

[79] Here, $r_i$ is the intrinsic growth rate of species $i$; $a_{ij}$ is a pairwise interaction coefficient, which quantifies the per capita effect of species $j$ on the growth of species $i$; $b_{ijk}$ is a ternary interaction coefficient representing the effect of species $j$ and $k$ on species $i$; $c_{ijkl}$ is a quaternary interaction coefficient; and so on. Note that $a_{ii}$ represents self-interactions, which is related to the carrying capacity of species $i$, denoted $K_i$, and is set by the steady-state abundance of the species in monoculture. High-order interactions, defined by coefficients $b_{ijk}$, $c_{ijkl}$, etc., require the simultaneous presence of more than two species. The highest order of interaction possible is $n$, which requires the simultaneous presence of all the species in the community. Elucidating all pairwise, ternary, quaternary, etc. interaction coefficients with a bottom-up approach would require steady-state data from the full-combinatorial set ($2^n - 1$) of experiments.
[80] In an embodiment, effective pairwise interactions, obtained by subsuming contributions from higher-order interactions into pairwise interactions, may be introduced in Eq. (2) as follows:

$$f_i(\mathbf{x}) = r_i + \sum_{j} e_{ij} x_j \qquad (3)$$

[81] where $e_{ij}$ is an effective pairwise interaction coefficient, which may capture the net effect of species $j$ on the growth rate $f_i$ of species $i$ through pairwise and higher-order interactions. In other words, it quantifies the change in $f_i$ when species $j$ is removed.
[82] In an embodiment, there are infinitely many ways to estimate the effective pairwise interaction coefficients, represented as the elements of the square matrix $\mathbf{E}$, from higher-order interactions, because a high-order term can be partitioned non-uniquely into many associated effective pairwise interactions. For example, the high-order term $b_{ijk} x_j x_k$ in Eq. (2) can be considered to be a part of the effective pairwise interaction coefficient $e_{ij}$ or of $e_{ik}$ or both, as illustrated in FIG. 5A. Thus, $e_{ij} = a_{ij} + b_{ijk} x_k$, subsuming $b_{ijk} x_j x_k$ into $e_{ij}$; or $e_{ik} = a_{ik} + b_{ijk} x_j$, subsuming it into $e_{ik}$; or $e_{ij} = a_{ij} + p\, b_{ijk} x_k$ and $e_{ik} = a_{ik} + q\, b_{ijk} x_j$, partitioning it into both, with $p$ and $q$ representing the partitioning fractions. To conserve growth rates, the fractions $p$ and $q$ must sum to 1. Generalizing to all high-order interactions, the effective pairwise coefficients can be written as:

$$e_{ij} = a_{ij} + \sum_{k} p_{ijk}\, b_{ijk}\, x_k + \sum_{k,l} p_{ijkl}\, c_{ijkl}\, x_k x_l + \cdots \qquad (4)$$

[83] where $p_{ijk}$, $p_{ijkl}$, etc. are the fractions of the corresponding high-order terms that may be considered to be a part of $e_{ij}$, with the fractions associated with each high-order term summing to 1. There may be infinitely many ways of choosing the values of these fractions, representing as many ways of estimating $\mathbf{E}$. When all the individual pairwise, ternary, and the like interaction terms are known a priori, these ways of partitioning are all equivalent and yield the same overall community structure. The method may require $2n$ species combinations to be studied, which may be the minimal set of combinations. Further, the resulting estimates may be unique and subsume all but the highest order interactions.
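The saving in the number of species combinations can be made concrete with a short calculation. The sketch below assumes that a bottom-up study examines every non-empty subset of the n species (a counting fact, giving 2^n - 1 combinations) and that the top-down method examines n monocultures plus n leave-one-out cultures; the function name is illustrative.

```python
def experiment_counts(n):
    """Return (bottom-up, top-down) numbers of species combinations."""
    full_combinatorial = 2 ** n - 1   # every non-empty subset of n species
    top_down = 2 * n                  # n monocultures + n leave-one-out cultures
    return full_combinatorial, top_down

for n in (5, 10, 20):
    print(n, experiment_counts(n))
```

The gap widens quickly because the first count grows exponentially with n while the second grows linearly.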
[84] In an embodiment, the diagonal terms in the effective pairwise interaction matrix $\mathbf{E}$ may be determined from the individual carrying capacities, $K_i$, obtained from monoculture experiments: setting the growth rate to zero in monoculture gives $e_{ii} = -r_i / K_i$. (The individual growth rates, $r_i$, can all be set to 1 without loss of generality when considering steady-state abundances. The interaction terms may then all be relative to $r_i$.) To obtain the off-diagonal terms ($e_{ij}$, $i \neq j$), sub-communities may be considered, i.e., species combinations with the number of species less than $n$, following a top-down approach. Thus the leave-one-out sub-communities may be considered first, which contain $n-1$ species each. There may be $n$ such sub-communities, each obtained by dropping one of the $n$ species. Let $x_i^{(-k)}$ denote the steady-state abundance of species $i$ in the sub-community without species $k$. Eq. (3) at steady state ($f_i = 0$) would then yield $r_i + \sum_{j \neq k} e_{ij}^{(-k)} x_j^{(-k)} = 0$, where $i \neq k$. The quantity $e_{ij}^{(-k)}$ is the effective pairwise interaction coefficient quantifying the overall influence of species $j$ on $i$ in the sub-community.
[85] In an embodiment, in a community with many species ($n \gg 1$), the absence of a few species may affect the interactions between the remaining species, in general, only minimally. The effect would be the least when a single species is missing. Thus, the approximation $e_{ij}^{(-k)} \approx e_{ij}$ may be assumed. The steady-state conditions for the leave-one-out experiments may then be transformed into a set of linear algebraic equations given by

$$r_i + \sum_{j \neq k} e_{ij}\, x_j^{(-k)} = 0, \qquad i \neq k, \quad k = 1, \dots, n \qquad (5)$$

[86] In an embodiment, Eq. (5) provides a set of $n(n-1)$ equations in the $n(n-1)$ off-diagonal entries of $\mathbf{E}$. (There are $n-1$ species ($i \neq k$) in each leave-one-out experiment and there are $n$ such experiments ($k = 1, \dots, n$).) Using the steady-state abundances from the leave-one-out experiments, the equations can be solved for the off-diagonal entries, $e_{ij}$. Because the equations are linearly independent, the solutions may be unique. Once $\mathbf{E}$ is thus determined, Eq. (3) can be solved for the species abundances of the original community.
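The estimation described above reduces to n small linear solves, one per species. The following sketch is one possible reading under stated assumptions: the GLV model, growth rates set to 1 by default, every remaining species persisting in every leave-one-out culture, and a hypothetical function name `epics` with 0-based indices. It is illustrative, not the disclosed implementation.

```python
import numpy as np

def epics(K, X, r=None):
    """Estimate the effective interaction matrix E and predict abundances.

    K: length-n monoculture abundances (carrying capacities).
    X: n x n leave-one-out table, X[k, j] = abundance of species j in the
       culture without species k (so X[k, k] = 0).
    r: intrinsic growth rates; defaults to 1 for every species.
    """
    K = np.asarray(K, dtype=float)
    X = np.asarray(X, dtype=float)
    n = K.size
    r = np.ones(n) if r is None else np.asarray(r, dtype=float)
    E = np.zeros((n, n))
    E[np.diag_indices(n)] = -r / K            # from the monoculture steady state
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Rows: experiments k != i. Columns: unknowns e_ij with j != i.
        M = X[np.ix_(others, others)]
        rhs = -r[i] - E[i, i] * X[others, i]
        E[i, others] = np.linalg.solve(M, rhs)
    x_pred = np.linalg.solve(E, -r)           # steady state of the full community
    return E, x_pred

# Demo on a pairwise-only GLV community, where the estimate should be exact.
A = np.array([[-1.00, 0.10, -0.20, 0.05],
              [0.15, -1.00, 0.10, -0.10],
              [-0.10, 0.20, -1.00, 0.10],
              [0.05, -0.15, 0.10, -1.00]])
r = np.ones(4)
K = -r / np.diag(A)
X = np.zeros((4, 4))
for k in range(4):
    keep = [j for j in range(4) if j != k]
    X[k, keep] = np.linalg.solve(A[np.ix_(keep, keep)], -r[keep])
E, x_pred = epics(K, X)
```

In the demo the synthetic leave-one-out table is generated from a known pairwise matrix `A`, so the recovered `E` can be compared with `A` directly; with high-order interactions present, `E` would instead absorb their contributions.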
[87] In another embodiment, species abundances in smaller sub-communities (with fewer than $n-1$ species) could also be used to estimate the effective pairwise interaction matrix $\mathbf{E}$. The leave-one-out sub-communities may have the advantage of subsuming the highest order of interactions among all sub-communities, $n-1$, rendering the estimates of $\mathbf{E}$ the most accurate. Further, they present the $n(n-1)$ pieces of information, in terms of steady-state abundances, needed to solve for as many off-diagonal entries in $\mathbf{E}$ with the smallest number ($n$) of sub-community experiments. Any other combination of sub-communities may require more experiments because each experiment would yield fewer steady-state abundances than the leave-one-out sub-communities. Besides, sub-communities containing high-order interactions tend to become increasingly unstable upon dropping more and more species, potentially making the proposition of using smaller sub-communities problematic, a disadvantage also of bottom-up approaches.
[88] In an embodiment, the GLV model with high-order interactions may be used to synthesize in silico microbial communities of different numbers of species, $n$. The highest order of interactions may be fixed at, but not limited to, four, and the intrinsic growth rate of each species may be set to a value that scales with $n$, to reflect the assumption that the communities of various $n$ may be cultured in environments with similar resources. Systematic scaling of the intrinsic growth rate with $n$ may allow a fair comparison across communities of different $n$. The steady-state abundances of species in an $n$-species community can be obtained by solving $f_i(\mathbf{x}) = 0$, where the $f_i$ may be as in Eq. (2). The abundances $x_i$ may be made dimensionless by dividing them by their carrying capacities, and the self-interaction coefficients are set accordingly.
[89] In another embodiment, off-diagonal elements of the pairwise interaction matrix $\mathbf{A}$ may be selected from a normal distribution with mean zero and standard deviation $\alpha$. A distribution symmetric about zero mean may ensure that there is no preferred positive or negative pairwise interaction. In an embodiment, $\alpha$ represents the 'strength' of the pairwise interactions. Similarly, elements of the ternary and quaternary interaction tensors $\mathbf{B}$ and $\mathbf{C}$ may be picked from normal distributions with means zero and standard deviations $\beta$ and $\gamma$, the latter representing the strengths of ternary and quaternary interactions, respectively. The diagonal terms of the high-order tensors were recognized as second-order terms and set to zero. Further, $b_{ijk} = b_{ikj}$, as they are the same interaction coefficient. Similarly, terms of $\mathbf{C}$ referring to the same interactions were equated. With the interactions thus specified, the steady-state abundances can be solved for.
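The sampling of random interaction tensors can be sketched as follows. Several details here are assumptions made only for illustration: the self-interaction is set to -r_i (which corresponds to dimensionless abundances with unit carrying capacity), "diagonal terms" is read as any entry with a repeated index, the tensors are made symmetric in their trailing indices, and the sums in the growth rate run over all index combinations. The function names are hypothetical.

```python
import numpy as np

def sample_interactions(n, alpha, beta, gamma, r, rng):
    """Draw A, B, C for a GLV community with up to quaternary interactions."""
    A = rng.normal(0.0, alpha, (n, n))
    np.fill_diagonal(A, -np.asarray(r, dtype=float))  # assumed: a_ii = -r_i

    B = rng.normal(0.0, beta, (n, n, n))
    i, j, k = np.indices((n, n, n))
    lo, hi = np.minimum(j, k), np.maximum(j, k)
    B = B[i, lo, hi]                                  # enforce b_ijk = b_ikj
    B[(i == j) | (i == k) | (j == k)] = 0.0           # repeated-index terms zero

    C = rng.normal(0.0, gamma, (n, n, n, n))
    i, j, k, l = np.indices((n, n, n, n))
    s = np.sort(np.stack([j, k, l]), axis=0)
    C = C[i, s[0], s[1], s[2]]                        # symmetric in (j, k, l)
    C[(i == j) | (i == k) | (i == l) | (j == k) | (j == l) | (k == l)] = 0.0
    return A, B, C

def growth_rates(x, r, A, B, C):
    """Per-capita growth rates f_i(x) with pairwise to quaternary terms."""
    return (r + A @ x
            + np.einsum("ijk,j,k->i", B, x, x)
            + np.einsum("ijkl,j,k,l->i", C, x, x, x))
```

Copying one draw to all permutations of the trailing indices keeps each coefficient's standard deviation at the requested strength, which averaging the permutations would not.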
[90] FIGs. 6A-6H illustrate an in silico 5-member microbial community demonstration in accordance with embodiments of the present disclosure. FIG. 6A illustrates an in silico 5-member microbial community that followed the GLV model with high-order interactions. The strengths of the binary ($\alpha$), ternary ($\beta$), and quaternary ($\gamma$) interactions may be set, and interactions sampled from the resulting normal distributions. The interactions may be used to predict steady-state abundances in all leave-one-out sub-communities. The species may be color-coded. The total height of a bar denotes the total abundance. The composition of a sub-community may be indicated by the heights of the color-coded partitions in a bar. Black circles may denote the carrying capacities of the species indicated in the x-axis label. In an exemplary embodiment, values were first assigned to all pairwise, ternary, and quaternary interaction coefficients, and the structures of the five leave-one-out sub-communities were predicted as shown in FIG. 6A. FIG. 6B illustrates the use of the steady-state abundances of leave-one-out sub-communities to estimate effective pairwise interactions. Here, the interaction coefficients may be reported normalized by the self-interaction terms. Using the abundances of the species in the sub-communities, the effective pairwise interactions may be estimated by solving Eq. (5) as shown in FIG. 6B. FIG. 6C illustrates the use of the effective pairwise interactions to predict the abundances in the original community, and their comparison with the actual abundances calculated using the true interactions. The structure of the 5-member community may be predicted using the effective pairwise interactions (Eq. (3)). EPICS captured the abundances in the original community accurately as shown in FIG. 6C. FIG. 6D illustrates 300 repetitions of the method, considering interactions where the community and all leave-one-out sub-communities were feasible and stable. The procedure was repeated 300 times, each time setting interaction parameters to different values drawn from the same distributions. FIG. 6D shows that EPICS closely agreed with the bottom-up approach, in which the species abundances are predicted using all the true pairwise, ternary, and quaternary interactions. Across the realizations, EPICS captured the abundances in the original community well. FIG. 6E illustrates the case $\alpha = 2.4 \times 10^{-3}$, $\beta = 0$, $\gamma = 0$, implying pairwise interactions alone. The effective pairwise interactions estimated by EPICS matched the true pairwise interactions perfectly. The comparison may be extended by introducing high-order interactions, as illustrated in FIG. 6F for $\alpha = 2.4 \times 10^{-3}$, $\beta = 1 \times 10^{-4}$, $\gamma = 0$; in FIG. 6G for $\alpha = 2.4 \times 10^{-3}$, $\beta = 0$, $\gamma = 1 \times 10^{-4}$; and in FIG. 6H for $\alpha = 2.4 \times 10^{-3}$, $\beta = 1 \times 10^{-4}$, $\gamma = 1 \times 10^{-4}$. As the high-order interactions grew, the effective pairwise interactions estimated by EPICS subsumed the higher-order interactions and thus became increasingly removed from the true pairwise interactions.
[91] FIGs. 7A-7K illustrate how the accuracy of EPICS increases with the number of species and how EPICS performed with different interaction strengths, in accordance with the embodiments of the present disclosure. As illustrated, in an embodiment, in silico microbial communities were synthesized using the GLV model with high-order interactions. The highest order of interactions was set to four. High-order interactions were fully dense, non-redundant, and had their diagonal terms zero. FIG. 7A shows a 3D interaction box with the pairwise, ternary, and quaternary interaction strengths ($\alpha$, $\beta$, and $\gamma$) as its dimensions. The model was evaluated at equally spaced points in the box, with 300 realizations performed at each point. Because most of the elements in the interaction coefficient tensors ($\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$) were set randomly, not all combinations of values of $\alpha$, $\beta$, and $\gamma$ led to feasible steady-state abundances ($x_i > 0$). Thus, values of $\alpha$, $\beta$, and $\gamma$ were first identified that gave rise to feasible steady-state abundances in the $n$-member community. At each feasible point in the interaction box, 300 realizations were performed by randomly choosing interactions for each realization, and the abundances were calculated using the bottom-up approach and using EPICS. This procedure was repeated for communities of different $n$, ranging at least from 5 to 20 but not limited to it. FIG. 7B illustrates comparisons of predicted species abundances using EPICS with those using the bottom-up approach for a community containing five species. The legends '1'-'6' represent points with increasing distance from the origin on the diagonal, while FIG. 7C illustrates the average fraction of the feasible cases for the first 6 of 10 points on the 3D diagonal of the interaction box (the red line in FIG. 7A). In an exemplary embodiment, to explore the effect of the strength of the interactions, at least ten equally spaced points on the 3D diagonal of the interaction box, but not limited to it, may be considered and the above calculations repeated. FIG. 7D illustrates the root-mean-square of relative error between abundances predicted by EPICS and the bottom-up approach, averaged over all species and realizations. In an embodiment, for any $n$, the accuracy of EPICS, estimated using the deviation between the abundances predicted using the true interactions and using EPICS, decreased as the strength of the interactions increased, because EPICS is exact when pairwise interactions alone exist and gets increasingly approximate as higher-order interactions grow. Communities also became more stable with increasing strengths of interactions. In another embodiment, along the 3D diagonal of the interaction box, the strengths of not only pairwise but also higher-order interactions increase. Previous studies have shown that interactions of fourth and higher order stabilize communities, explaining these observations. At each point in the interaction box, increasing $n$ increased the accuracy of the estimations. FIG. 7E illustrates the eigenvalues with the largest real part of the Jacobian matrix of the GLV model at the steady state predicted by the model. In an embodiment, the effect of eliminating a species may diminish as the number of species in the community increases. The leave-one-out communities then lead to increasingly more accurate estimates of the effective pairwise interactions and therefore better estimations of community structures using EPICS. FIGs. 7F, 7G and 7H illustrate the slope, intercept, and goodness of fit (R-squared), respectively, of the straight-line fit to the abundance data from all the 300 realizations at each point. In an embodiment, at least two accuracy measures may be used to evaluate the accuracy of EPICS.
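In the first approach, the root-mean-square (RMS) value of relative errors averaged over all species and all realizations may be used, which may be given by

$$\mathrm{RMS} = \sqrt{\frac{1}{N n} \sum_{m=1}^{N} \sum_{i=1}^{n} \left( \frac{x_{i,m}^{\mathrm{EPICS}} - x_{i,m}^{\mathrm{true}}}{x_{i,m}^{\mathrm{true}}} \right)^{2}}$$

where $N$ is the number of realizations, $n$ is the number of species, and $x_{i,m}^{\mathrm{EPICS}}$ and $x_{i,m}^{\mathrm{true}}$ are the abundances of species $i$ in realization $m$ predicted by EPICS and by the bottom-up approach, respectively.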
[92] In the second approach, a straight line may be fitted to the EPICS vs. bottom-up abundance data for each point in the interaction box and each value of $n$, and the slope, intercept, and goodness of fit (R-squared) calculated, as illustrated in FIGs. 7F, 7G and 7H.
[93] In an exemplary embodiment, to further test the accuracy of EPICS, a straight line may be fitted to the data of predicted vs. actual abundances and the goodness of fit examined. It was found that the slope of the line approached unity, as illustrated in FIG. 7F, and the intercept approached zero, as illustrated in FIG. 7G, as the number of species $n$ increased, indicating closer agreement between abundances predicted by EPICS and the bottom-up approach. The fit was also better, as illustrated in FIG. 7H, demonstrating overall the gain in accuracy of EPICS with $n$. FIG. 7I illustrates the calculations extended beyond the 3D diagonal, where the fraction of the feasible cases for each point in the interaction box may be calculated. FIG. 7J shows the region in the interaction box with feasibility above 95%. For each such feasibility threshold, ranging from 0% to 100%, the fraction of cases and the average RMS error were calculated, and FIG. 7K illustrates that accuracy improved upon increasing the number of species in the community.
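The two accuracy measures can be computed in a few lines. The sketch below assumes that the RMS is taken over the pooled relative errors of all species and realizations and that the straight line is an ordinary least-squares fit of the EPICS abundances against the bottom-up abundances; the function name and argument names are illustrative.

```python
import numpy as np

def accuracy_measures(x_true, x_epics):
    """Return (RMS relative error, slope, intercept, R-squared).

    x_true:  abundances from the bottom-up approach (all positive).
    x_epics: abundances predicted from effective pairwise interactions.
    Both may be 2-D (realizations x species); they are pooled here.
    """
    x_true = np.asarray(x_true, dtype=float).ravel()
    x_epics = np.asarray(x_epics, dtype=float).ravel()
    rel_err = (x_epics - x_true) / x_true
    rms = float(np.sqrt(np.mean(rel_err ** 2)))
    slope, intercept = np.polyfit(x_true, x_epics, 1)   # least-squares line
    fitted = slope * x_true + intercept
    ss_res = np.sum((x_epics - fitted) ** 2)
    ss_tot = np.sum((x_epics - x_epics.mean()) ** 2)
    r_squared = 1.0 - ss_res / ss_tot
    return rms, float(slope), float(intercept), float(r_squared)
```

Perfect agreement corresponds to an RMS of zero, a slope of one, an intercept of zero, and an R-squared of one, which is the limit the text reports being approached as the number of species grows.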
[94] FIGs. 8A-8K illustrate generic representations of the robustness of EPICS in accordance with embodiments of the present disclosure. Not all communities obey the GLV model; whether the results illustrated in FIGs. 7A-7K applied to communities following a different population dynamics model or different settings of interactions was therefore examined. In an exemplary embodiment, the results held for communities following the replicator dynamics model as illustrated in FIG. 8A. As illustrated, the procedure in FIGs. 7A-7K may be repeated but now with microbial communities following the replicator dynamics model instead of the GLV model. Since most of the elements in the interaction tensors ($\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$; Eq. (2)) may be set randomly, not all values of the interaction strengths $\alpha$, $\beta$, and $\gamma$ led to feasible steady-state abundances ($x_i > 0$). Thus, combinations of $\alpha$, $\beta$, and $\gamma$ that gave rise to feasible steady-state abundances in the $n$-member community may be identified. At first, the case with only pairwise interactions ($\beta = \gamma = 0$) may be considered, where $\alpha$ may be varied in logarithmic steps. For each $\alpha$, at least 1000 realizations were run and the fraction of realizations in which all species stably coexisted at steady state was recorded. This fraction may be called the stable coexistence frequency. The feasible cases may be identified to be the ones that had all steady-state abundances positive. In another embodiment, to identify stable cases, the Jacobian of the bracketed terms in Eq. (2) may be calculated and checked for whether all of its eigenvalues had negative real parts. It may be observed that as $\alpha$ increased, the corresponding coexistence frequency followed an upside-down sigmoidal curve. In another embodiment, a critical value of $\alpha$ may be calculated as the value below which the stable coexistence frequency exceeded 95%, and this was repeated for a range of values of $n$. In yet another exemplary embodiment, the procedure was repeated for cases where only ternary or only quaternary interactions existed. In an exemplary embodiment, the critical strengths of the pairwise, ternary, and quaternary interactions each varied with $n$.
[95] FIGs. 9A-9K illustrate generic representations of EPICS being robust to variations in the distributions of the random interaction elements in accordance with embodiments of the present disclosure.
[96] FIGs. 10A-10K illustrate generic representations of EPICS being robust to redundant high-order interactions in accordance with embodiments of the present disclosure.
[97] FIG. 10A shows a 3D interaction box with the pairwise, ternary, and quaternary interaction strengths ($\alpha$, $\beta$, and $\gamma$) as its dimensions, and the procedure in FIG. 5A may be performed for higher-order interactions. The model was evaluated at equally spaced points in the box, with 300 realizations performed at each point. Because most of the elements in the interaction coefficient tensors ($\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$) were set randomly, not all combinations of values of $\alpha$, $\beta$, and $\gamma$ led to feasible steady-state abundances ($x_i > 0$). Thus, values of $\alpha$, $\beta$, and $\gamma$ were first identified that gave rise to feasible steady-state abundances in the $n$-member community. At each feasible point in the interaction box, 300 realizations were performed by randomly choosing interactions for each realization, and the abundances were calculated using the bottom-up approach and using EPICS. This procedure was repeated for communities of different $n$, ranging at least from 5 to 20 but not limited to it. FIG. 10B illustrates comparisons of predicted species abundances using EPICS with those using the bottom-up approach for a community containing five species. The legends '1'-'6' represent points with increasing distance from the origin on the diagonal, while FIG. 10C illustrates the average fraction of the feasible cases for the first 6 of 10 points on the 3D diagonal of the interaction box (the red line in FIG. 10A). In an exemplary embodiment, to explore the effect of the strength of the interactions, at least ten equally spaced points on the 3D diagonal of the interaction box, but not limited to it, may be considered and the above calculations repeated. FIG. 10D illustrates the root-mean-square of relative error between abundances predicted by EPICS and the bottom-up approach, averaged over all species and realizations. FIG. 10E illustrates the eigenvalues with the largest real part of the Jacobian matrix of the GLV model at the steady state predicted by the model. FIGs. 10F, 10G and 10H illustrate the slope, intercept, and goodness of fit (R-squared), respectively, of the straight-line fit to the abundance data from all the 300 realizations at each point. FIG. 10I illustrates the calculations extended beyond the 3D diagonal, where the fraction of the feasible cases for each point in the interaction box may be calculated. FIG. 10J shows the region in the interaction box with feasibility above 95%. For each such feasibility threshold, ranging from 0% to 100%, the fraction of cases and the average RMS error were calculated, and FIG. 10K illustrates that accuracy improved upon increasing the number of species in the community.
[98] FIGs. 11A-11K illustrate generic representations of EPICS being robust to the selection of the distributions of the random interaction coefficient tensors in accordance with embodiments of the present disclosure.
[99] In an embodiment, when the species in a microbial community follow population dynamics characterized by more than one interaction parameter per species pair (such as the Holling type II functional response), steady-state data of leave-one-out sub-communities would not be enough to calculate effective interactions. In such a scenario, examining ‘leave-two-out’ sub-communities, ‘leave-three-out’ sub-communities, and so on, may be required to generate the requisite steady-state data. Such a top-down approach, although requiring more than the minimal set of experiments above, still may require fewer species combinations to be explored than a corresponding bottom-up approach such as examining all duos, trios and the like.
[100] In an exemplary embodiment, models more involved than the GLV or the replicator dynamics model may be considered and may be restricted to pairwise interactions. The equation (Eq. (1)) may be re-written as

[101] In an embodiment, when there are no high-order interactions, Eq. (S1) simplifies to

[102] where the function (also called a functional response) can be any real-valued function and represents the functional form of the direct effect of species j on species i. Some popular forms of the functional response, with their names, are tabulated below in TABLE 1:

Table 1
Number of unknown parameters per species pair	Name of the functional response	Form of the functional response
1	Generalized Lotka-Volterra (GLV)	
2	Holling Type II (H)	
3	DeAngelis-Beddington (DB)	
3	Crowley-Martin (CM)	

[103] Except for GLV, the functional responses incorporate more than one parameter per species pair, and these parameters cannot be determined using the steady-state data of monocultures and leave-one-out communities alone. However, if the steady-state data of other sub-communities are available in number equal to or more than the number of unknown parameters, one can determine these parameters. Thus, in an exemplary embodiment, the steady-state data of leave-two-out communities along with the monocultures and leave-one-out communities may be used as a top-down approach. The top-down approach may require fewer species combinations to be explored than a corresponding bottom-up approach, where the steady-state data of at least all possible trios (three-species cultures), but not limited to it, may be used to calculate the unknown parameters. Similarly, if more data are required for an even more complex model, a subset of all possible leave-three-out communities can be explored. In an exemplary embodiment, the steady-state data from leave-two-out communities along with monocultures and leave-one-out communities may be more than what is needed to estimate the parameters for sufficiently large communities for a Holling Type II functional response, and likewise for DeAngelis-Beddington and Crowley-Martin functional responses, as shown in TABLE 2.
Table 2
Number of unknown parameters per species pair	Total number of unknown parameters	Species combinations to be examined (bottom-up)	Species combinations to be examined (top-down)
1			(Better)
2			(Better)
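As a way of example and not as a limitation, the comparison of experimental effort summarized in TABLE 2 can be sketched as a simple count of species combinations. The sketch below assumes that the top-down approach uses n monocultures and n leave-one-out communities, adding all leave-two-out communities when two parameters per species pair are unknown, and that the bottom-up approach uses monocultures and all duos, adding all trios in the two-parameter case. The function names are hypothetical and the counts are illustrative, not the entries of TABLE 2.

```python
from math import comb


def top_down_combinations(n: int, params_per_pair: int) -> int:
    """Species combinations cultured in the assumed top-down scheme."""
    if params_per_pair == 1:
        # n monocultures + n leave-one-out communities
        return 2 * n
    if params_per_pair == 2:
        # additionally, every leave-two-out community (one per species pair)
        return 2 * n + comb(n, 2)
    raise ValueError("this sketch only covers 1 or 2 parameters per pair")


def bottom_up_combinations(n: int, params_per_pair: int) -> int:
    """Species combinations cultured in the assumed bottom-up scheme."""
    if params_per_pair == 1:
        # n monocultures + all two-species cultures
        return n + comb(n, 2)
    if params_per_pair == 2:
        # additionally, all three-species cultures
        return n + comb(n, 2) + comb(n, 3)
    raise ValueError("this sketch only covers 1 or 2 parameters per pair")


if __name__ == "__main__":
    for n in (5, 10, 20):
        for k in (1, 2):
            print(n, k, top_down_combinations(n, k), bottom_up_combinations(n, k))
```

Under these assumptions, the one-parameter top-down count grows linearly with n, while the bottom-up counts grow quadratically or cubically.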

[104] In an exemplary embodiment, the results also held for redundant high-order interactions with non-zero diagonal terms as illustrated in FIGs. 9A-9K and were robust to changes in the density of the high-order interaction tensors as illustrated in FIGs. 10A-10K. EPICS thus appeared robust to the specific population dynamics model and interaction strengths and distributions.
[105] In yet another exemplary embodiment, experimental measurements on a 5-member representative community of the gut microbiome of Drosophila melanogaster may have been carried out. With EPICS, the number of species combinations to be studied increases only linearly with n. As an example and not as limitation, a 10-member community may require the study of at least 20 sub-communities (10 leave-one-out and 10 single-species sub-communities) for EPICS. EPICS may require data from the least number of species combinations to uniquely determine the effective interaction map. The effective pairwise interactions may subsume all but the highest order (n) interactions possible. The model can then be used to predict the species abundances in the community. In another exemplary embodiment, as a way of example and not as a limitation, EPICS was first established with in silico microbial communities, where the underlying interactions were precisely known, and then applied to data of the 5-member gut microbial community of Drosophila melanogaster, but not limited to it. The proposed method accurately described the microbial community structure of Drosophila melanogaster, but not limited to it, and performed better than existing methods.
[106] FIGs. 12A-12C illustrate how EPICS captured abundances in a 5-member representative gut microbial community of Drosophila melanogaster in accordance with an embodiment of the present disclosure.
[107] FIG. 12D illustrates exemplary representations of EPICS outperforming mean-field and pairwise-only models in accordance with embodiments of the present disclosure.
[108] As illustrated in FIG. 10A, steady-state abundances in all leave-one-out sub-communities were measured. The species may be Lactobacillus plantarum (Lp), Lactobacillus brevis (Lb), Acetobacter pasteurianus (Ap), Acetobacter tropicalis (At), and Acetobacter orientalis (Ao), but not limited to the like, and are color-coded. The total height of a bar for a sub-community may denote the corresponding total abundance, while the composition of the sub-community may be indicated by the heights of the color-coded partitions in the bar. Filled black circles may denote the carrying capacities of the species indicated in the x-axis label, as illustrated in FIG. 10B. Estimated effective pairwise interactions and the interaction strengths may be reported normalized by the self-interaction terms, as illustrated in FIG. 10C. Abundances in the original community were predicted using EPICS and compared with the measured abundances.
[109] In an embodiment, the community followed GLV dynamics with high-order interactions. Steady-state abundances in the leave-one-out sub-communities were used to infer effective pairwise interaction coefficients using the present method, and these coefficients were used to predict the abundances in the original 5-member community. The estimations using EPICS captured the measured abundances in the original community accurately, as illustrated in FIGs. 10A-10D.
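As a way of example and not as a limitation, the inference and prediction steps described above can be sketched as follows. The sketch assumes GLV dynamics written in a form normalized by the self-interaction terms, so that at a coexisting steady state each species i satisfies x_i + sum over j != i of B[i][j] * x_j = K[i], where K[i] is the monoculture abundance. This normalized form, the data layout loo[k][j] (abundance of species j in the community lacking species k), and all function names are assumptions made for illustration; at least three species are needed for the linear systems to be determined.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: data do not fix the unknowns")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def infer_effective_interactions(K, loo):
    """Effective pairwise interactions, normalized so that B[i][i] = 1.

    K[i]      : steady-state abundance of species i in monoculture.
    loo[k][j] : steady-state abundance of species j in the community
                lacking species k (loo[k][k] is 0).
    """
    n = len(K)
    B = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # One equation per leave-one-out community that still contains i.
        rows = [[loo[k][j] for j in others] for k in others]
        rhs = [K[i] - loo[k][i] for k in others]
        for j, value in zip(others, solve_linear(rows, rhs)):
            B[i][j] = value
    return B


def predict_community(B, K):
    """Steady-state abundances of the full community: solve B x = K."""
    return solve_linear(B, K)
```

In this sketch each species contributes n-1 unknowns and n-1 equations, which is why the 2n cultures (monocultures plus leave-one-out communities) suffice under the stated assumption.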
[110] FIGs. 13A-13C illustrate effective pairwise interactions calculated using EPICS, pairwise-only, and true interactions, in accordance with embodiments of the present disclosure.
[111] FIG. 13D illustrates the breakup of interactions at the order of interactions level in accordance with embodiments of the present disclosure.
[112] In an exemplary implementation, for the representative gut microbial community of Drosophila melanogaster, estimations of steady-state abundances by various methods were compared with the experimental abundances in the 5-member community. As illustrated, EPICS performed better than the simple mean-field model, in which the abundance of a species may be calculated as the mean of its experimental abundances in all the richest lower-diversity combinations in which it is observed. In an exemplary embodiment, the mean-field model may be used to predict the abundance of a species in a community as the mean of its steady-state abundances in all leave-one-out sub-communities in which it was observed. For example, and not as limitation, the abundance of species 1 in the 5-member community (1-2-3-4-5) would be the mean of its abundances in the sub-communities (1-2-3-4), (1-2-3-5), (1-2-4-5), and (1-3-4-5).
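As a way of example and not as a limitation, the mean-field estimate described above can be sketched as follows. The data layout loo[k][j] (abundance of species j in the community lacking species k) and the function name are hypothetical and used only for illustration.

```python
def mean_field_prediction(loo, species):
    """Mean abundance of a species over the leave-one-out sub-communities
    in which it is present (every community except the one lacking it)."""
    values = [loo[k][species] for k in range(len(loo)) if k != species]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Illustrative numbers only, not measured data.
    loo = [
        [0.0, 2.0, 3.0],   # community lacking species 0
        [4.0, 0.0, 6.0],   # community lacking species 1
        [8.0, 10.0, 0.0],  # community lacking species 2
    ]
    print([mean_field_prediction(loo, s) for s in range(3)])
```

The mean-field estimate uses the same leave-one-out data as EPICS but ignores how abundances depend on which species is absent.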
[113] FIGs. 14A-14D illustrate exemplary representations of EPICS and pairwise-only models in predicting structures of sub-communities in accordance with embodiments of the present disclosure. For the representative 5-member gut microbial community of Drosophila melanogaster, the effective pairwise interactions may be estimated using the steady-state abundances in leave-one-out sub-communities and used to predict the steady-state abundances in at least two-, three- and five-species communities, but not limited to the like. True pairwise interactions may be estimated using the steady-state abundances of all two-species communities and used to predict the abundances in all three-, four-, and five-species communities, but not limited to the like. The effective pairwise interactions outperformed the true pairwise interactions in predicting the abundances in the sub-communities and the original community, as illustrated in FIG. 14A. FIG. 14B illustrates a direct comparison of the experimental abundances with the predicted abundances. The estimations may be made using effective pairwise interactions (filled triangles) and the true pairwise interactions (filled circles). The species present in each experiment may be indicated by the filled dots in the x-axis labels. The x-axis labels may be placed below the panel illustrated in FIG. 14D and apply to the panels illustrated in FIGs. 14B-14D. In an exemplary embodiment, the height of a bar shows the log10 value of the total abundance in the corresponding community. The composition of the community is indicated by bars on a linear scale. FIGs. 14C and 14D show the ratios of the predicted abundances to the actual abundances. In an embodiment, data used to estimate the interaction parameters in each case may be identified as training data. Because the steady-state abundances in all two-species sub-communities were available, all true pairwise interactions could be estimated.
The true pairwise interactions, as expected, differed considerably from the effective pairwise interactions, indicating the presence of high-order interactions. In an embodiment, the high-order interactions may be estimated with the bottom-up approach using the steady-state abundances of at least all three-, four-, and five-species communities, but not limited to the like. High-order interactions, along with the pairwise interactions, shaped the structure of this community.
[114] FIG. 15 illustrates exemplary representations of calculating effective pairwise interactions using true interactions in accordance with embodiments of the present disclosure
[115] In yet another embodiment, simple partitioning of the interactions into effective pairwise interactions yielded interactions removed from the estimates obtained using EPICS. The calculations may suggest that a comparison of effective and true pairwise interactions could be used to detect high-order interactions. If the two differ significantly, high-order interactions may be deemed to exist. Such a comparison would require at least all 2-member sub-communities to be studied in addition to the 2n above, which is likely to be more efficient than the full-factorial set. Only if the existence of high-order interactions is established would one examine additional sub-communities.
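As a way of example and not as a limitation, the comparison of true and effective pairwise interactions can be sketched as follows. The sketch assumes GLV dynamics normalized by the self-interaction terms, so that in a two-species culture of i and j the steady state satisfies x_i + B[i][j] * x_j = K[i]. The data layout, the function names, and the tolerance used to decide that two interaction maps "differ significantly" are hypothetical choices made for illustration.

```python
def true_pairwise_from_duos(K, duos):
    """Normalized pairwise interactions from two-species steady states.

    K[i]         : steady-state abundance of species i in monoculture.
    duos[(i, j)] : (x_i, x_j), steady-state abundances when only
                   species i and j are cultured together.
    """
    n = len(K)
    B = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (i, j), (x_i, x_j) in duos.items():
        B[i][j] = (K[i] - x_i) / x_j  # effect of species j on species i
        B[j][i] = (K[j] - x_j) / x_i  # effect of species i on species j
    return B


def high_order_suspected(B_effective, B_true, tol=0.1):
    """Flag high-order interactions when any coefficient differs by more
    than tol (an illustrative threshold, not a value from the source)."""
    n = len(B_true)
    return any(
        abs(B_effective[i][j] - B_true[i][j]) > tol
        for i in range(n)
        for j in range(n)
    )
```

A stricter or statistically motivated criterion could replace the fixed tolerance; the sketch only shows where the two interaction maps enter the comparison.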
[116] In yet another exemplary embodiment, when the effective pairwise interactions were used to predict species abundances in smaller sub-communities, which do not have all the high-order interactions present, the estimations using EPICS can be inaccurate. Nonetheless, the abundances in the sub-communities predicted using the effective pairwise interactions captured via EPICS were more accurate than the estimations made using true pairwise interactions.
[117] In an exemplary embodiment, abundances in at least the 5-member community may be predicted using the true pairwise interactions and compared with EPICS. In an exemplary embodiment, EPICS outperformed true pairwise interactions, indicating that the effective pairwise interactions subsume high-order interactions, which true pairwise interactions miss, and therefore may predict the structure of a microbial community that contains high-order interactions more accurately. Furthermore, both effective and true pairwise interactions may be used to predict the abundances in all possible sub-communities.
[118] In an embodiment, the interactions may be known in the representative gut microbial community, and hence Eq. (4), with the appropriate parameter settings, may be used to calculate an estimate of the effective pairwise interactions from the true interactions as illustrated in FIG. 10. In an exemplary embodiment, equal partitioning of high-order interactions into the associated pairwise terms may be implied by FIG. 10. In another embodiment, many elements in the estimated interactions did not match the effective pairwise interactions inferred using the method. Other partitioning strategies may yield yet other estimates, as illustrated in FIG. 15, highlighting the difficulty in obtaining robust estimates of effective pairwise interactions following this approach. Non-unique bottom-up partitioning of high-order interactions into effective pairwise interactions may be shown in FIG. 15. The concept, using a 3-species microbial community that follows the Generalized Lotka-Volterra model with high-order interactions, may be illustrated in FIG. 12B. The dynamical equation for species 1 may be presented. The third-order interaction can be partitioned into the two associated pairwise terms, each weighted by its respective partitioning fraction. To conserve the growth rates, the two partitioning fractions must sum to 1. As illustrated in FIG. 15, one can obtain at least three such partitions, all equivalent, but not limited to it: (1602); (1604); (1606).
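As a way of example and not as a limitation, the non-uniqueness of such a partitioning can be sketched for species 1 of a 3-species community. The sketch assumes a per-capita growth rate of the form r1 + a11*x1 + a12*x2 + a13*x3 + c123*x2*x3, which is an illustrative form and not necessarily the equation of the figures. All symbol names and numerical values are hypothetical.

```python
def per_capita_growth(r1, a11, a12, a13, x1, x2, x3, c123=0.0):
    """Per-capita growth rate of species 1 under the assumed model."""
    return r1 + a11 * x1 + a12 * x2 + a13 * x3 + c123 * x2 * x3


def partition_third_order(a12, a13, c123, x2, x3, fraction):
    """Fold the third-order term into the two pairwise coefficients.

    `fraction` goes to the (1, 2) term and 1 - fraction to the (1, 3)
    term, so the two partitioning fractions sum to 1.
    """
    a12_eff = a12 + fraction * c123 * x3
    a13_eff = a13 + (1.0 - fraction) * c123 * x2
    return a12_eff, a13_eff


if __name__ == "__main__":
    r1, a11, a12, a13, c123 = 1.0, -1.0, -0.2, 0.1, 0.05
    x1, x2, x3 = 0.8, 1.5, 2.0
    full = per_capita_growth(r1, a11, a12, a13, x1, x2, x3, c123)
    for fraction in (0.0, 0.5, 1.0):
        a12_eff, a13_eff = partition_third_order(a12, a13, c123, x2, x3, fraction)
        pairwise_only = per_capita_growth(r1, a11, a12_eff, a13_eff, x1, x2, x3)
        # Same growth rate for every fraction, but different coefficients.
        print(fraction, a12_eff, a13_eff, pairwise_only, full)
```

Every choice of the fraction reproduces the same growth rate at the given abundances while yielding different effective coefficients, which is the sense in which the bottom-up partitioning is non-unique under this assumed form.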
[119] In yet another exemplary embodiment, comparison with in vivo data provides a strong test of EPICS and demonstrates its applicability, scalability, and ease-of-use in describing the structures of microbial communities.
[120] In an exemplary implementation, a first set of instructions may be performed by choosing three interaction strengths. Each interaction strength may range over its own interval, and logarithmic steps with a total of at least 10 discrete values in the range for each parameter, but not limited to it, may be used. The values constitute an interaction box of points, as illustrated in FIG. 5A. In an exemplary embodiment, the number of species in the community, n, may be any or a combination of 5, 7, 9, 11, 15, and 20, but not limited to the like. For each point in the interaction box and each value of n, interaction tensors (A, B, and C) may be computed by randomly sampling elements as described above, using an adjusted form of the MATLAB® function ‘rand’. The steady-state abundances may then be calculated using the solve function in MATLAB®. The multivariable algebraic equation may be solved using the Levenberg-Marquardt algorithm with at least two different initial conditions, but not limited to the like. The solutions converged to a unique set of positive steady-state abundances. The non-stable solutions may be filtered out using the local stability criterion. The steady-state abundances in all leave-one-out communities may then be calculated, and the procedure continued only when all the species in each leave-one-out community stably coexisted. The effective pairwise interactions may then be estimated using these abundances and used to predict the steady-state abundances in the n-species communities. Finally, the predicted abundances may be compared with the abundances from bottom-up estimations in the communities. The procedure may be repeated 300 times for each point in the interaction box and each value of the number of species in the community.
[121] Thus, in an exemplary embodiment, the present disclosure provides for a novel, top-down method (EPICS) to predict structures of microbial communities using minimal experiments and mathematical modeling. In another exemplary embodiment, applicability was demonstrated on at least a 5-member gut microbial community previously cultured in Drosophila melanogaster, but not limited to it. EPICS may rely on subsuming high-order interactions in effective pairwise interactions and using them to predict community structures. A further enabling feature of EPICS may be that the estimations become increasingly accurate as the number of species in the community increases, making it a promising approach for describing large, natural microbial communities. The advantage may arise because the highest order of interactions present in a community is likely to be increasingly smaller than the number of species in the community as the latter grows. In yet another exemplary embodiment, as a way of example and not as a limitation, the 5-member community may harbor a 5th order interaction, while a 100-member community is highly unlikely to harbor a 100th order interaction; the highest order of interaction is likely to be <10, but not limited to it. The leave-one-out sub-communities then tend to reflect effective pairwise interactions increasingly accurately, because the elimination of any one species may be expected to influence the interactions of the remaining species only marginally. EPICS may have outperformed other commonly used bottom-up methods, such as the mean-field method, in predicting community structures. The choice of the community ecology model may be an important step in correctly inferring the underlying interactions and is, in principle, decided from the knowledge of the community dynamics. For example, the GLV model may have been extensively used to model pairwise interactions in microbial communities.
Simple models such as the GLV model may serve as useful alternatives, especially if used in the same time window in which the models may be trained, such as close to the steady-state.
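As a way of example and not as a limitation, the construction of the interaction box can be sketched as follows. The numerical ranges below are hypothetical placeholders and are not the ranges of the present disclosure; only the use of logarithmic steps, 10 discrete values per parameter, and three parameters is taken from the description above. The function names are likewise hypothetical.

```python
from itertools import product


def log_spaced(low, high, count):
    """`count` values from low to high (inclusive) in logarithmic steps."""
    ratio = (high / low) ** (1.0 / (count - 1))
    return [low * ratio ** k for k in range(count)]


def interaction_box(ranges, count=10):
    """All combinations of the log-spaced values of each parameter.

    ranges : list of (low, high) bounds, one per interaction strength.
    """
    axes = [log_spaced(low, high, count) for low, high in ranges]
    return list(product(*axes))


if __name__ == "__main__":
    # Hypothetical bounds for three interaction strengths.
    box = interaction_box([(0.01, 1.0), (0.001, 0.1), (0.0001, 0.01)])
    print(len(box))  # 10 values per parameter, three parameters
```

Each point of the box would then be combined with each community size and each of the repeated random realizations described above.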
[122] While the foregoing describes various embodiments of the invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. The scope of the invention is determined by the claims that follow. The invention is not limited to the described embodiments, versions or examples, which are included to enable a person having ordinary skill in the art to make and use the invention when combined with information and knowledge available to the person having ordinary skill in the art.

ADVANTAGES OF THE PRESENT DISCLOSURE
[123] Some of the advantages of the present disclosure, which at least one embodiment herein satisfies are as listed herein below.
[124] The present disclosure provides for a method that accurately predicts community structure.
[125] The present disclosure provides for a method that is robust to parameter and model structure variations.
[126] The present disclosure provides for a method that facilitates subsuming high-order interactions between species into effective pairwise interactions using minimal experiments.
[127] The present disclosure provides for a method in which experimental effort scales linearly, rather than exponentially, with the number of species.
[128] The present disclosure provides for a method that facilitates accurate estimations for a large number of species in the community, making it a promising approach for describing large, natural microbial communities.
[129] The present disclosure provides for a method that facilitates the ability to understand and predict structures of large communities using minimal, potentially tractable experiments.

Documents

Application Documents

#	Name	Date
1	202141013762-STATEMENT OF UNDERTAKING (FORM 3) [27-03-2021(online)].pdf	2021-03-27
2	202141013762-REQUEST FOR EXAMINATION (FORM-18) [27-03-2021(online)].pdf	2021-03-27
3	202141013762-POWER OF AUTHORITY [27-03-2021(online)].pdf	2021-03-27
4	202141013762-FORM 18 [27-03-2021(online)].pdf	2021-03-27
5	202141013762-FORM 1 [27-03-2021(online)].pdf	2021-03-27
6	202141013762-DRAWINGS [27-03-2021(online)].pdf	2021-03-27
7	202141013762-DECLARATION OF INVENTORSHIP (FORM 5) [27-03-2021(online)].pdf	2021-03-27
8	202141013762-COMPLETE SPECIFICATION [27-03-2021(online)].pdf	2021-03-27
9	202141013762-Proof of Right [08-05-2021(online)].pdf	2021-05-08
10	202141013762-FORM-8 [26-03-2025(online)].pdf	2025-03-26
11	202141013762-OTHERS [15-04-2025(online)].pdf	2025-04-15
12	202141013762-EDUCATIONAL INSTITUTION(S) [15-04-2025(online)].pdf	2025-04-15