Abstract: The current invention relates to methods and compositions to identify and select male sterile cotton plants displaying genetic male sterility. The invention discloses novel single nucleotide polymorphic markers that are associated closely with ms5ms6 double recessive genotype in cotton plants. These molecular markers are useful for marker-assisted selection and validation of male sterile plants in cotton.
Description:
FIELD OF INVENTION
The current invention relates to the field of plant breeding. The current invention more specifically relates to novel single nucleotide polymorphism (SNP) markers linked to genetic male sterility (GMS) trait in cotton. These markers can be used to detect, identify and select male sterile cotton plants.
BACKGROUND
Cotton (Gossypium spp.) is the world's most important textile fiber crop and is also one of the world's most important oilseed crops. Cotton plants provide a source of human food, livestock feed, and raw material in industry. Cotton seed is pressed for cooking oil and the residual cottonseed oil meal is used for animal feed. Cotton is not only the world’s leading textile fiber and oilseed crop, but it is also of significance for foil energy and bioenergy production. Global demand for cotton products is expected to increase 102% during the period 2000-2030. Decreasing arable land, declining water supplies, and the impact of global climate change make it uncertain whether production can keep increasing to meet this growing demand.
Conventional breeding of cotton, which requires large amounts of water and land, cannot sustain such levels of increasing production without other advanced supporting technologies, such as new hybrid cotton plants that may have desirable traits such as disease resistance, a lower water requirement, and higher yield.
Out of the 50 Gossypium species, four are cultivated: the allotetraploids G. hirsutum and G. barbadense, and the diploids G. herbaceum and G. arboreum. Upland cotton (G. hirsutum), known as long staple cotton or Mexican cotton, produces over 90 per cent of the world’s cotton (Ref 6: Mehetre S. J.).
The genus Gossypium is very large, currently comprised of more than 50 species. Two tetraploid species of Gossypium have spinnable seed fibers called lint. These two species are G. hirsutum (referred to as American Upland cotton) and G. barbadense (referred to as Pima cotton). The upland cotton (Gossypium hirsutum L.) genome (2n=4x=52) is large and complex, requiring a large collection of DNA markers to achieve maximum genome coverage and utility in diverse germplasm. The goal of cotton breeding is to improve the cotton plant's performance, and consequently its economic value, by combining various desirable traits into a single plant. Improved performance can include one or more of many desirable traits. For example, higher yields of cotton plants contribute to increased lint fiber production, more profitable agriculture and lower cost of products for the consumer. Improved plant health increases the yield and quality of the plant and reduces the need for application of protective chemicals. Adapting cotton plants to a wider range of production areas achieves improved yield and vegetative growth. Improved plant uniformity enhances the farmer's ability to mechanically harvest cotton.
New cotton varieties can be developed by inbreeding heterozygous plants and practicing selection for superior plants for several generations until substantially homozygous plants are obtained. During the inbreeding process, the vigor of the plant lines decreases and after a sufficient amount of inbreeding, additional inbreeding merely serves to increase seed of the developed variety. Cotton varieties are typically developed for use in the production of hybrid cotton lines.
A cross between two defined substantially homozygous cotton plant varieties always produces a uniform population of heterozygous hybrid cotton plants. When two different, unrelated cotton parent plant varieties are crossed to produce an F1 hybrid, one parent variety is designated as the male, or pollen parent, and the other parent variety is designated as the female, or seed parent. Because cotton plants are capable of self-pollination, hybrid seed production requires elimination or inactivation of the pollen produced by the female parent, rendering the female parent plant male sterile so that the variety designated as the female cannot self-pollinate. This process is highly labour intensive and adds significant cost to any breeding program. Different options exist for controlling male fertility in cotton plants, such as physical emasculation, genetic male sterility, cytoplasmic male sterility and application of gametocides. Incomplete removal of male parent plants from a hybrid seed production field before harvest provides the potential for unwanted production of self-pollinated or sib-pollinated seed, which can be unintentionally harvested and packaged with hybrid seed.
Thus, for self-pollinating crops such as cotton, producing hybrids is a challenging task.
Efficient hybrid seed production requires that cross-pollination predominates over self-pollination. A major limitation in the production of hybrid seed for many crop species including cotton is the lack of simple, reliable and economical methods of generating sterility in at least one parent (especially the male parent, to result in male-sterility while leaving female gametes intact and accessible for pollination by a suitable pollen donor). Male sterility is also useful where pollen spread is not desirable, e.g., from a domestic plant to its wild relatives, or where flower fertilization is not desirable, e.g., in the case of ornamental flowers which deteriorate in condition after pollination.
The development of new cotton plant varieties and hybrid cotton plants is a slow, costly process which requires a high level of breeding expertise. The development of new varieties and hybrid cotton plants involves numerous steps, including: (1) selection of parent cotton plants (germplasm) for initial breeding crosses; (2) inbreeding of the selected plants from the breeding crosses for several generations to produce a series of varieties, which individually breed true and are highly uniform; and (3) crossing a selected variety with an unrelated variety to produce the F1 hybrid progeny having increased vigor.
Molecular markers can be effectively utilized to characterize the vast germplasm resources of Gossypium spp. Molecular breeding integrated with conventional phenotypic selection is increasingly being utilized for key traits in cotton. A critical requirement for success is the availability of a large number of easily accessible, polymorphic DNA markers for analyzing cotton breeding populations, so that the breeding process becomes cost effective. The low level of polymorphism, especially within upland cotton, necessitates that a large library of DNA markers be available for various applications. Identification of quantitative trait loci (QTL) is one such tool to help in marker-assisted selection (MAS) of resistant genotypes.
There are multiple genes governing genetic male sterility in cotton; ms5 and ms6 are two major loci, which together confer genetic male sterility in homozygous double recessive state.
Molecular breeding is nowadays the method of choice for the utilization of molecular (DNA-based) tools, including markers, to enhance the efficiency of the plant breeding process.
Identification of novel molecular markers associated with any desirable trait is a complex process. Identifying molecular markers linked with genetic male sterility in cotton can pave the way for convenient and less time consuming molecular breeding of cotton plants, by making cross-pollination between inbred varieties easier.
Though there are some reports of molecular markers in cotton associated with other traits, such as lint quality and photosensitive male sterility, not many molecular markers have been reported to be associated with genetic male sterility in cotton. Patent documents reporting cotton molecular markers include Chinese patent applications CN107338298A, CN107338304A and CN111778354A (Refs 10, 11 and 12), and Feng et al (Ref 5) described some molecular markers associated with GMS in G. hirsutum.
The current invention discloses novel SNP markers linked with genetic male sterility in cotton plants, and a method of identifying, selecting and breeding cotton plants exhibiting genetic male sterility associated with ms5ms6 double recessive genes.
SUMMARY
One embodiment of the current invention is a SNP marker for identifying and selecting a genetically male sterile cotton plant or germplasm, wherein the SNP marker is an ms5 or ms6 sterile SNP marker allele at a SNP marker locus, and wherein the ms5 sterile SNP marker allele is SNP1 (TRLCTMS-582), comprising a substitution of “T” in place of “C” at position 63 of SEQ ID NO: 1; and wherein the ms6 sterile marker allele is selected from the group consisting of: SNP2 (TRLCTMS-2), comprising a substitution of “C” in place of “T” at position 101 of SEQ ID NO: 3, and SNP3 (TRLCTMS-3), comprising a substitution of “C” in place of “T” at position 101 of SEQ ID NO: 5; and wherein the ms5 sterile marker allele and at least one of the ms6 sterile marker alleles are present in double homozygous state in the genetically male sterile cotton plant.
One embodiment of the current invention is a method of identifying a genetically male sterile cotton plant or germplasm, the method comprising the steps of: detecting in a cotton plant or germplasm the presence of at least one ms5 sterile SNP marker allele and at least one ms6 sterile marker allele as disclosed herein, wherein both the at least one ms5 and the at least one ms6 sterile alleles are present in homozygous state and wherein the double ms5ms6 homozygous state is linked to the male sterile phenotype of the cotton plant or germplasm.
In one embodiment, the method further comprises identifying a fertile maintainer or double heterozygous fertile cotton plant or germplasm, the method comprising the steps of: a) detecting in a cotton plant or germplasm the presence of the at least one ms5 sterile SNP marker allele in heterozygous state and the at least one ms6 sterile marker allele in heterozygous state to identify the double heterozygous plant or germplasm for the at least one ms5 and the at least one ms6 sterile alleles, wherein the double heterozygous plant or germplasm exhibits fertile phenotype; or b) detecting in the cotton plant or germplasm the presence of the at least one ms5 sterile SNP marker allele in heterozygous state or the at least one ms6 sterile marker allele in heterozygous state to identify the single heterozygous plant or germplasm for the at least one ms5 or the at least one ms6 sterile marker allele, wherein the single heterozygous plant or germplasm exhibits fertile phenotype and has the maintainer plant genotype.
In one embodiment, the method further comprises the steps of: a) obtaining DNA from the cotton plant or germplasm; and b) analysing the DNA from step (a) for the presence of ms5 and ms6 sterile SNP marker alleles.
In one embodiment, the single heterozygous plant or germplasm identified by the method disclosed above is identified as maintainer plant for maintaining genetically male sterile cotton plant lines.
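The genotype classes described in the embodiments above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method as claimed. It assumes that genotyping yields a two-letter diploid call per marker (for example "CT"), that a single-heterozygous maintainer is homozygous sterile at the other locus (as in the definition of “maintainer plant” given under Definitions), and that the best-supported ms6 marker (SNP2 or SNP3) stands in for the ms6 locus. The function names `dosage` and `classify` are hypothetical.

```python
# Sterile alleles as stated in the Summary: SNP1 (ms5) = T, SNP2 and SNP3 (ms6) = C.
STERILE_ALLELE = {"SNP1": "T", "SNP2": "C", "SNP3": "C"}
MS6_MARKERS = ("SNP2", "SNP3")


def dosage(marker, call):
    """Count sterile alleles (0, 1 or 2) in a two-letter diploid call."""
    return sum(1 for base in call.upper() if base == STERILE_ALLELE[marker])


def classify(calls):
    """Classify a plant from marker calls, e.g. {"SNP1": "TT", "SNP2": "CC"}.

    Assumption: at least one ms6 marker call is supplied, and the highest
    sterile-allele dosage among the supplied ms6 markers represents ms6.
    """
    ms5 = dosage("SNP1", calls["SNP1"])
    ms6 = max(dosage(m, calls[m]) for m in MS6_MARKERS if m in calls)
    if ms5 == 2 and ms6 == 2:
        return "male sterile"
    if ms5 == 1 and ms6 == 1:
        return "double heterozygous fertile"
    if (ms5, ms6) in {(1, 2), (2, 1)}:
        return "maintainer fertile"
    return "fertile (other genotype)"


if __name__ == "__main__":
    print(classify({"SNP1": "TT", "SNP2": "CC", "SNP3": "CC"}))
    print(classify({"SNP1": "CT", "SNP2": "TC"}))
```

Plants that carry no sterile allele at one of the loci fall into the last class; they are fertile but are neither maintainers nor double heterozygotes under the assumptions stated above.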
In one embodiment, the current invention encompasses a method of identifying a male sterile cotton plant or germplasm that displays genetic male sterility, the method comprising the step of: detecting in the cotton plant or germplasm at least one ms6 male sterile marker, wherein the at least one ms6 marker is located within a chromosomal interval comprising and flanked by SEQ ID NO:7 and SEQ ID NO:8. In one embodiment, the above method further comprises the steps of: a) isolating DNA from the cotton plant or germplasm; and b) analyzing the isolated DNA for the presence of at least one ms6 sterile marker allele in the chromosomal interval comprising and flanked by SEQ ID NO:7 and SEQ ID NO:8. In one embodiment, the one or more ms6 sterile marker alleles identified in this interval are selected from the group consisting of SNP2 and SNP3.
In one embodiment, the current invention encompasses a method of identifying a male sterile, maintainer fertile, or double heterozygous fertile cotton plant or germplasm from a cotton plant or germplasm population produced by crossing a male sterile cotton plant or germplasm with a fertile cotton plant or germplasm, the method comprising the steps of:
a) Crossing a genetically male sterile plant or germplasm which is homozygous for at least one ms5 sterile marker allele SNP1 and at least one ms6 male sterile allele selected from the group consisting of SNP2 and SNP3, as disclosed herein, with a second recurrent fertile parent cotton plant or germplasm to obtain a F1 plant and a segregating progeny F2 plant or germplasm population by selfing of F1 plant;
b) Selecting a F2 progeny plant from the segregating progeny F2 plant population from step (a) with: (i) both the at least one ms5 sterile allele and at least one ms6 sterile allele in homozygous state, wherein the F2 progeny plant or germplasm exhibits male sterile phenotype; or (ii) both the at least one ms5 sterile marker allele and the at least one ms6 sterile marker allele in heterozygous state, wherein the double heterozygous F2 plant or germplasm exhibits fertile phenotype; or (iii) either the at least one ms5 sterile marker allele or the at least one ms6 sterile marker allele in heterozygous state, wherein the single heterozygous F2 plant or germplasm exhibits fertile phenotype and is a maintainer plant or germplasm; followed by the step of
c) Selecting F3 progeny plants or germplasm from the segregating progeny F3 plant population or germplasm from step (b) with: (i) both the at least one ms5 sterile allele and at least one ms6 sterile allele in homozygous state, wherein the F3 progeny plant or germplasm exhibits male sterile phenotype; or (ii) both the at least one ms5 sterile marker allele and the at least one ms6 sterile marker allele in heterozygous state, wherein the double heterozygous F3 plant or germplasm exhibits fertile phenotype; or (iii) either the at least one ms5 sterile marker allele or the at least one ms6 sterile marker allele in heterozygous state, wherein the single heterozygous F3 plant or germplasm exhibits fertile phenotype and is a maintainer plant or germplasm;
d) repeating selection steps b and c “n” number of times whenever maintainer plants or double heterozygous plants or germplasm are selected, wherein n is 2 to 8 more filial generations, and wherein the single heterozygous alleles and the double heterozygous plants or germplasm are selected; and
e) selecting a sib-mating progeny plant or germplasm population derived from sib-crossing a near-isogenic fertile maintainer plant or germplasm which is heterozygous either at the at least one ms5 or the at least one ms6 sterile marker allele with a sterile plant or germplasm homozygous at both the at least one ms5 and the at least one ms6 marker alleles.
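To make the F2 selection in step (b) concrete, the expected genotype frequencies after selfing a double heterozygous F1 can be enumerated directly. The sketch below assumes simple Mendelian inheritance with ms5 and ms6 assorting independently and equal transmission of all gametes; these are modelling assumptions, not results reported in this document. Genotypes are written as (ms5 dosage, ms6 dosage), where dosage is the number of sterile alleles, and the function name `selfing_classes` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def selfing_classes():
    """Expected F2 genotype frequencies from selfing a double heterozygous F1.

    Alleles are coded 1 = sterile (recessive) and 0 = fertile.
    Assumption: the two loci assort independently.
    """
    # The F1 forms four equally likely gametes: (ms5 allele, ms6 allele).
    gametes = list(product((0, 1), repeat=2))
    counts = Counter()
    for egg, pollen in product(gametes, repeat=2):
        ms5 = egg[0] + pollen[0]
        ms6 = egg[1] + pollen[1]
        counts[(ms5, ms6)] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    freq = selfing_classes()
    print("double homozygous sterile:", freq[(2, 2)])
    print("double heterozygous:", freq[(1, 1)])
    print("het at one locus, homozygous sterile at the other:",
          freq[(1, 2)] + freq[(2, 1)])
```

Under these assumptions only one F2 plant in sixteen is expected to be double homozygous sterile, which illustrates why marker-based selection before flowering is useful in segregating populations.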
In one embodiment, the second parent plant or germplasm in the method described above is a recurrent parent containing unfavourable alleles at ms5 and ms6, and the method further comprises the steps of:
a) Backcrossing the F1 plant or germplasm obtained in step (a) of method described above, with the recurrent parent cotton plant to get BC1F1 progeny plant population or germplasm; and
b) Selecting a progeny plant or germplasm from the segregating BC1F1 progeny plant or germplasm population with both the at least one ms5 and the at least one ms6 sterile marker alleles in heterozygous state, and backcrossing the selected progeny plant or germplasm with the recurrent parent plant or germplasm to produce BC2F1, wherein the double heterozygous plant or germplasm is fertile;
c) Repeating steps (a)-(b) “n” number of times, wherein “n” is 2 to 5 or more, to obtain BCnF1 progeny plant, followed by selfing of BCnF1 plants to get BCnF2 segregating progeny plant or germplasm; and
d) Selecting, from the BCnF2 progeny, a near-isogenic maintainer plant or germplasm of the recurrent plant type that is heterozygous for either the at least one ms5 or the at least one ms6 sterile marker allele, and sib-mating the near-isogenic maintainer fertile plant with a sterile plant within the family to produce a male sterile plant or germplasm with the background genotype of the recurrent parent plant or germplasm.
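As a rough planning aid for the backcross steps above, one can estimate how many backcross progeny need to be genotyped to be reasonably sure of recovering at least one double heterozygous plant. The sketch assumes independent assortment of ms5 and ms6, so that a double heterozygous plant backcrossed to a recurrent parent carrying only fertile alleles gives double heterozygous progeny with probability 1/2 × 1/2 = 1/4. The 95% and 99% confidence levels and the function name `min_plants` are illustrative choices, not values from this document.

```python
import math


def min_plants(prob_target=0.25, confidence=0.95):
    """Smallest n such that P(at least one target plant among n) >= confidence.

    prob_target: probability that a single progeny plant has the target
    genotype (1/4 for a double heterozygote under the stated assumptions).
    Solves 1 - (1 - prob_target) ** n >= confidence for integer n.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - prob_target))


if __name__ == "__main__":
    print(min_plants())                 # 11 plants for 95% confidence
    print(min_plants(confidence=0.99))  # 17 plants for 99% confidence
```

In practice a breeder would usually genotype more plants than this minimum, because background selection for the recurrent parent type is carried out on the same progeny.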
BRIEF DESCRIPTION OF SEQUENCES
SEQ ID NO: 1 represents the SNP1 sequence, with the ms5 fertile / unfavorable polymorphism at position 63.
SEQ ID NO: 2 represents the SNP1 sequence, with the ms5 sterile / favorable polymorphism at position 63.
SEQ ID NO: 3 represents the SNP2 sequence, with the ms6 fertile / unfavorable polymorphism at position 101.
SEQ ID NO: 4 represents the SNP2 sequence, with the ms6 sterile / favorable polymorphism at position 101.
SEQ ID NO: 5 represents the SNP3 sequence, with the ms6 fertile / unfavorable polymorphism at position 101.
SEQ ID NO: 6 represents the SNP3 sequence, with the ms6 sterile / favorable polymorphism at position 101.
SEQ ID NO: 7 is the 100bp left flanking sequence for a sequence encompassing ms6 SNPs.
SEQ ID NO: 8 is the 100bp right flanking sequence for a sequence encompassing ms6 SNPs.
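The marker positions listed above can be read programmatically once a sequence spanning a marker is available. The following sketch assumes that positions in the sequence listing are 1-based and that the input string is already in the same orientation and coordinate frame as the corresponding SEQ ID. The function name `call_allele` is hypothetical, and the demonstration sequence is an artificial placeholder, not the actual SEQ ID NO: 1 or 2.

```python
# Marker positions and alleles as given in the Summary and in the
# Brief Description of Sequences.
SNP_SITES = {
    "SNP1": {"position": 63, "fertile": "C", "sterile": "T"},   # ms5
    "SNP2": {"position": 101, "fertile": "T", "sterile": "C"},  # ms6
    "SNP3": {"position": 101, "fertile": "T", "sterile": "C"},  # ms6
}


def call_allele(marker, sequence):
    """Return 'sterile', 'fertile' or 'unknown' for one haploid sequence."""
    site = SNP_SITES[marker]
    index = site["position"] - 1  # assumed 1-based listing -> 0-based index
    if len(sequence) <= index:
        raise ValueError(f"sequence too short to cover {marker}")
    base = sequence[index].upper()
    if base == site["sterile"]:
        return "sterile"
    if base == site["fertile"]:
        return "fertile"
    return "unknown"


if __name__ == "__main__":
    # Artificial placeholder sequence with 'T' at position 63.
    placeholder = "A" * 62 + "T" + "A" * 20
    print(call_allele("SNP1", placeholder))
```

A diploid call would combine the result for both alleles of a plant; in a high-throughput assay this call is normally produced by the genotyping platform rather than from raw sequence.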
DETAILED DESCRIPTION
The current invention discloses novel SNP markers for identifying and selecting cotton plants exhibiting genetic male sterility associated with the presence of ms5ms6 double recessive genotype.
The invention encompasses the development and validation of novel SNP markers that are tightly linked to the ms5ms6-based genetic male sterility (GMS) trait in cotton. These markers can be used in marker-based selection of fertile or sterile phenotypes before flowering, and in the development of new GMS lines through trait introgression.
Cotton is a dicot plant with perfect flowers, i.e., cotton has male, pollen producing organs and separate female, pollen receiving organs on the same flower. Because cotton has both male and female organs on the same flower, cotton breeding techniques take advantage of the plant's ability to be bred by both self-pollination and cross-pollination. Self-pollination occurs when pollen from the male organ is transferred to a female organ on the same flower on the same plant. Cross-pollination occurs when pollen from the male organ on the flower of one plant is transferred to a female organ on the flower on a different plant. A plant is sib-pollinated (a type of cross-pollination) when individuals within the same family or line are used for pollination (i.e. pollen from a family member plant is transferred to the stigmas of another family member plant). Self-pollination and sib pollination techniques are traditional forms of inbreeding used to develop new cotton varieties, but other techniques exist to accomplish inbreeding.
For self-pollinating crops such as cotton, producing hybrids is a challenging task.
Efficient hybrid seed production requires that cross-pollination predominates over self-pollination. A major limitation in the production of hybrid seed for many crop species including cotton is the lack of simple, reliable and economical methods of generating sterility in at least one parent (especially the male parent, to result in male-sterility while leaving female gametes intact and accessible for pollination by a suitable pollen donor). Male sterility is also a useful trait when pollen spread is not desirable, e.g., from a domestic plant to its wild relatives.
Self-pollination for several generations produces homozygosity at almost all gene loci, forming a uniform population of true breeding progeny, known as inbreds. Hybrids are developed by crossing two homozygous inbreds to produce heterozygous gene loci in hybrid plants and seeds.
Male sterility can be accomplished by many methods, such as by physical removal of the organs containing the male gametes. This can be a highly labor-intensive and therefore expensive process. Physical emasculation is also difficult in many plant species because of the plant's anatomy. Alternative techniques that do not involve manual or physical emasculation are therefore highly desirable.
Self-incompatibility is a form of infertility caused by the failure of cotton plants with normal pollen and ovules to set seed due to some physiological hindrance that prevents fertilization. Self-incompatibility hinders self-pollination and inbreeding and fosters cross-pollination.
Induction of Male Sterility by Male Gametocides: Male sterility can be induced through the use of chemicals, which are commonly known as male gametocides. Some of the chemicals used for induction of male sterility are FW-450 (Sodium B-Dichloro-iso-butyrate) or MH-30 (Maleic hydrazide) and Ethidium bromide (a potent mutagen).
Chemical gametocides have also been described as a method of generating male-sterile plants. Typically, such a chemical gametocide is a herbicidal compound that, when applied to a plant at an appropriate developmental stage or before sexual maturity, is capable of killing or effectively terminating the development of a plant's male gametes while leaving the plant's female gametes, or at least a significant proportion of them, capable of undergoing cross-pollination. However, the levels of the chemical necessary to kill most of the male gametes while leaving a sufficient number of female gametes still capable of fertilization can result in undesirable effects such as phytotoxicity.
Spraying an aqueous solution of FW-450 or MH-30 induces male sterility in cotton. Application of 2,3-dichloro-iso-butyrate at the rate of 1.02 lb per acre shows selective toxicity to the male gametes. Higher treatment concentrations can cause male as well as female sterility and various adverse effects such as reduction in yield, boll and seed size and increase in lint percentage (Singh et al, Ref 9).
Hence, commercial production of hybrid seed using chemical gametocides is limited by their lack of selectivity for gametes, and phytotoxic effects.
Another option for making hybrids is to use an inbred parent that comprises a male sterility trait or a transgene imparting sterility. This emasculated inbred, often referred to as the female, produces the hybrid seed, F1. The hybrid seed that is produced is heterozygous. However, the grain produced by a plant grown from F1 hybrid seed is referred to as F2 grain. F2 grain, which is a plant part produced on the F1 plant, will comprise segregating germplasm, even though the hybrid plant is heterozygous. The F1 hybrid plant shows greatly increased vigor and seed yield compared to parent inbreds. Inbred plants on the other hand are mostly homozygous, rendering them less vigorous. Although several genes have been identified, a recessive genic male sterility system based on the ms5 and ms6 alleles seems advantageous to cotton breeders because it provides stable and complete male sterility in hybrid seed production.
Hybrid cotton seed is commonly produced by hand emasculation of one of its parental lines, which adds huge labour cost and time. Development of GMS-based hybrids in cotton reduces the cost of hybrid seed production by eliminating the hand emasculation process completely. GMS-based hybrid seed production can greatly reduce labour cost and increase genetic purity. Multiple genes for the genetic male sterility trait have been reported in cotton, but ms5ms6 double recessive gene based GMS is commonly used for hybrid breeding across the world. Conversion of elite fertile lines into GMS recessive homozygous lines through conventional methods needs selfing after each backcross generation to identify double heterozygous fertile plants, which leads to a higher number of generations in the trait conversion program. Molecular markers linked to GMS could play a critical role in avoiding progeny testing and alternate selfing, and in reducing the breeding cycle in a backcrossing method.
Based on its mode of inheritance, male sterility may be divided into nuclear male sterility (also called genic male sterility; GMS) and cytoplasmic male sterility (CMS). In contrast to CMS and its use for hybrid production, GMS can enhance random mating to develop hybrids since almost all inbred elite and non-elite cultivars can be used as its restorer line. The major shortcoming of GMS is that the offspring of GMS plants, pollinated by heterozygous pollinators, theoretically will produce a 1 : 1 male sterile and fertile segregation. Therefore, the fertile offspring have to be scored and eradicated, which is uneconomical if done by traditional means. The use of molecular markers for marker-assisted selection is a very useful tool for identifying and selecting the male sterile progeny before flowering.
The pollen sterility that is caused by nuclear genes is termed as genic or genetic male sterility (GMS). In cotton, GMS has been reported in upland, Egyptian and arboreum cottons. In tetraploid cotton, male sterility is governed by both recessive and dominant genes. So far, 19 GMS genes in tetraploid cotton have been identified, including seven single recessive genes (ms1, ms2, ms3, ms13, ms14, ms15 and ms16), four duplicate/double recessive genes (ms5ms6 and ms8ms9) and eight single dominant genes (MS4, MS7, MS10, MS11, MS12, MS17, MS18 and MS19). However, male sterility governed by recessive genes is used more frequently in plant breeding.
Genetic male sterility is unstable and there are chances of male sterile plants becoming male fertile under low temperature conditions. Out of 16 different genes reported in G. hirsutum, ms5ms6 is the most stable source. Moreover, in GMS, 50% of the population is male fertile, and the fertile plants are identified after flower initiation.
Thus, there is a need to use markers in GMS for early identification and removal of fertile plants to save resources and time, to quickly convert non-GMS lines into GMS versions, and to select desirable combinations in GMS breeding programs for developing new GMS inbred lines.
Male sterility has important applications in the development of hybrids. Molecular marker-assisted selection (MAS) is being applied extensively to cotton.
The identification of GMS-linked molecular markers can greatly increase the efficiency of transferring GMS alleles to elite inbred lines using MAS. Genetic mapping of GMS genes is also important for map-based cloning in cotton. Identification of molecular markers closely linked to the ms5 and ms6 alleles is useful for effective transfer of male-sterility genes into cultivars or elite lines using marker-assisted backcrossing and forward breeding methods, or for validation of male fertile lines.
Benefits, applications and possible uses of the ms5 and ms6 markers disclosed herein:
• These SNP markers can be used for high throughput genotyping of several thousands of segregating lines before flowering (at any stage of the plant from seed to maturity) within a short period of time for a new GMS line development program.
• The GMS SNP markers developed in this invention can be used to identify double heterozygous plants in trait introgression, which otherwise requires additional selfing.
• Markers can be useful to develop new GMS lines through various breeding methods without the need for specific trait introgression.
• Markers can be used to confirm the genetic purity of commercial hybrids or any GMS parental lines, to check sib-ratio segregation, and to eliminate fertile lines from hybrid seed production fields before flowering (or at any stage from seed/seedling to plant maturity).
• GMS marker based hybrid seed production ensures high quality of hybrid seed since all fertile lines can be easily eliminated at any stage before flowering.
• Markers can accurately differentiate zygosity at ms5 or ms6 or both and can predict the GMS phenotype in Gossypium hirsutum of diverse genetic backgrounds in any generation, including inbreds, hybrids or segregating generations.
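One way to check the sib-ratio segregation mentioned in the list above is a chi-square goodness-of-fit test of marker-based sterile and fertile counts against the expected 1 : 1 ratio. This statistical test is a standard technique chosen here for illustration and is not described in this document. The critical value 3.841 corresponds to one degree of freedom at the 5% level, and the counts in the example are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at the 5% level.
CRITICAL_5_PERCENT_1_DF = 3.841


def chi_square_1_to_1(sterile, fertile):
    """Chi-square statistic for observed counts against a 1:1 expectation.

    Assumes at least one plant was scored (sterile + fertile > 0).
    """
    expected = (sterile + fertile) / 2
    return ((sterile - expected) ** 2 + (fertile - expected) ** 2) / expected


def fits_1_to_1(sterile, fertile):
    """True if the counts do not deviate significantly from 1:1 at the 5% level."""
    return chi_square_1_to_1(sterile, fertile) <= CRITICAL_5_PERCENT_1_DF


if __name__ == "__main__":
    # Hypothetical counts from marker-based scoring of one sib-mated family.
    print(chi_square_1_to_1(48, 52), fits_1_to_1(48, 52))
    print(chi_square_1_to_1(30, 70), fits_1_to_1(30, 70))
```

A family that deviates significantly from 1 : 1 might indicate contamination or a fertile parent that was not a true maintainer, and could be flagged for closer inspection.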
Maintenance of GMS lines
GMS lines are maintained by sib-mating between fertile and sterile plants within family. The pollination is done by hand. The identification of sterile plants can be done by manual selection or by MAS.
Genetic male sterility (GMS) in cotton mediated by the two homozygous recessive genes, ms5ms5 and ms6ms6, is expressed as non-dehiscent anthers and unviable pollen grains. Sequence analysis on ms5 and ms6 loci in Gossypium hirsutum can be conducted to reveal genomic variation at these two loci between GMS and wild-type G. hirsutum inbred lines, and sequence polymorphism linked to ms5 on A12 and ms6 on D12 can lead to identification of novel markers.
Several molecular markers, including SNPs and gel-based markers, have been reported for ms5ms6 GMS in cotton. SNP markers require a high-throughput platform for genotyping and also require stringent haplotype validation in breeding germplasm before deployment. GMS markers can be deployed directly for breeding population development, trait introgression, forward breeding and GMS female inbred segregation QC studies for commercial hybrid GMS inbred parents.
In the conventional GMS line conversion breeding, double heterozygous plants during backcross F1 (BCnF1) generations can be identified only by testing selfed progeny of all plants. Backcrossing alternating with self-pollination is required to identify plants with the double heterozygous genotype in order to maintain ms5 and ms6 genes in BCnF1 generations. However, markers can completely eliminate this step and identify double heterozygous plants at BCnF1 or any other generations accurately at any stage of the plant. Plants that are heterozygous for either ms5 or ms6 favorable alleles disclosed herein are single heterozygous maintainer plants, and have fertile phenotype. These maintainer plants can be used to maintain the male sterile plants. Plants that are homozygous for both ms5 and at least one of the ms6 alleles disclosed herein are double homozygous and are phenotypically sterile, and genetically male sterile.
Definitions:
As used herein, the term "cotton plant" includes whole cotton plants, cotton plant cells, cotton plant protoplast, cotton plant cell or cotton tissue culture from which cotton plants can be regenerated, cotton plant calli, cotton plant clumps and cotton plant cells that are intact in cotton plants or parts of cotton plants, such as cotton seeds, cotton cobs, cotton flowers, cotton leaves, cotton stems, cotton buds, cotton roots, cotton root tips and the like.
“Self-pollination” occurs when pollen from the male organ is transferred to a female organ on the same flower on the same plant.
“Self-incompatibility” is a form of infertility caused by the failure of cotton plants with normal pollen and ovules to set seed due to some physiological hindrance that prevents fertilization. Self-incompatibility restricts self-pollination and inbreeding and fosters cross-pollination.
“Cross-pollination” occurs when pollen from the male organ on the flower of one plant is transferred to a female organ on the flower on a different plant.
As used herein, a “male sterile plant” is a plant that does not produce male gametes that are viable or otherwise capable of fertilization.
The term “female” refers to a plant that produces ovules. Female plants generally produce seeds after fertilization. A plant designated as a “female plant” may contain both male and female sexual organs or may only contain female sexual organs either naturally or due to emasculation by physical or chemical means, or from male-sterility.
The term “male” refers to a plant that produces pollen grains. The “male plant” generally refers to the sex that produces male gametes for fertilizing ova. A plant designated as a “male plant” may contain both male and female sexual organs, or may only contain male sexual organs either naturally (e.g., in dioecious species) or due to emasculation (e.g., by removing the ovary).
A plant is sib-pollinated (a type of cross-pollination) when individuals within the same family or line are used for pollination (for example, pollen from a family member plant is transferred to the stigmas of another family member plant).
Self-pollination and sib pollination techniques are traditional forms of inbreeding used to develop new cotton varieties, but other techniques exist to accomplish inbreeding.
GMS lines can be maintained by sib-mating between fertile and sterile plants. The pollination is manual. Sterile plants are identified by trained personnel, and 50% of the plant population is thus eliminated.
A “maintainer plant” or “maintainer line” as defined herein is a plant that is heterozygous at one of the ms5 and ms6 loci and homozygous recessive at the other, i.e., Ms5ms5ms6ms6 or ms5ms5Ms6ms6.
New cotton varieties are developed by inbreeding heterozygous plants and practicing selection for superior plants for several generations until substantially homozygous plants are obtained. During the inbreeding process with cotton, the vigor of the lines decreases and after a sufficient amount of inbreeding, additional inbreeding merely serves to increase seed of the developed variety. Cotton varieties are typically developed for use in the production of hybrid cotton lines.
Vigor is restored when two different varieties are cross-pollinated to produce the first generation (F1) progeny. A cross between two defined substantially homozygous cotton plant varieties always produces a uniform population of heterozygous hybrid cotton plants, and such hybrid cotton plants are capable of being generated indefinitely from the corresponding variety cotton seed supply. When two different, unrelated cotton parent plant varieties are crossed to produce an F1 hybrid, one parent variety is designated as the male, or pollen parent, and the other parent variety is designated as the female, or seed parent. The development of new cotton plant varieties and hybrid cotton plants is a slow, costly, interrelated process that requires the expertise of breeders and many other specialists. The development of new varieties and hybrid cotton plants in a cotton plant breeding program involves numerous steps, including: (1) selection of parent cotton plants (germplasm) for initial breeding crosses; (2) inbreeding of the selected plants from the breeding crosses for several generations to produce a series of varieties, which individually breed true and are highly uniform; and (3) crossing a selected variety with an unrelated variety to produce the F1 hybrid progeny having restored vigor.
As used herein, a "polymorphism" is a variation in the DNA between two or more individual plants within a population. A polymorphism preferably has a frequency of at least 1 % in a population. A useful polymorphism can include a single nucleotide polymorphism (SNP), a simple sequence repeat (SSR), or an insertion/deletion polymorphism, also referred to herein as an "indel".
As used herein, the term "allele" refers to one of two or more different nucleotide sequences that occur at a specific locus.
An allele is "associated with" a trait when it is part of, or linked to, a DNA sequence or allele that affects the expression of the trait.
An allele "negatively" correlates with a trait when it is linked to it and when presence of the allele is an indicator that a desired trait or trait form will not occur in a plant comprising the allele.
An allele "positively" correlates with a trait when it is linked to it and when presence of the allele is an indicator that the desired trait or trait form will occur in a plant comprising the allele.
As used herein, a "favourable allele" is the allele at a particular locus that confers, or contributes to, an agronomically desirable phenotype; examples of such traits include disease resistance and resistance to herbicides. The favourable allele allows the identification of plants with that agronomically desirable phenotype. A favourable allele of a marker is a marker allele that segregates with the favourable phenotype.
As used herein, the term “favorable allele” is used interchangeably with the term “sterile allele”. A favorable or sterile marker allele disclosed herein is associated with either ms5 or ms6 sterile genotype. A double recessive homozygous genotype of ms5ms5ms6ms6 is associated with a male sterile phenotype.
As used herein, the double heterozygous plant or germplasm refers to a plant or germplasm having both the ms5 and ms6 sterile marker alleles in heterozygous state. Plants having such genotype are fertile plants.
As used herein, the single heterozygous plant or germplasm refers to a plant or germplasm having one of either the ms5 or ms6 sterile marker alleles in heterozygous state. Plants having such genotype are fertile maintainer plants.
The three genotype classes of the cotton plants described herein are given below in Table 1. Lowercase letters denote the recessive allele (Ref 3: Chen et al), which is the sterile allele, and uppercase letters denote the dominant allele, which is the fertile allele (for the current invention, the sterile marker alleles are denoted by SEQ ID Nos: 2, 4 and 6 for SNP1, SNP2 and SNP3 respectively; and the fertile marker alleles are denoted by SEQ ID Nos: 1, 3 and 5 for SNP1, SNP2 and SNP3 respectively).
Table 1
Plant type            Genotype
Male sterile          ms5ms5ms6ms6
Maintainer            Ms5ms5ms6ms6 or ms5ms5Ms6ms6
Double heterozygous   Ms5ms5Ms6ms6
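The genotype classes in Table 1 can be expressed as a simple lookup on the number of sterile (recessive) alleles carried at each locus. The following Python sketch is illustrative only: the function name `classify_gms_genotype` and the encoding of each locus as a count of 0, 1 or 2 sterile alleles are assumptions made for this example and are not part of the disclosure. Genotypes that are not listed in Table 1 are reported simply as fertile.

```python
def classify_gms_genotype(ms5_sterile_count: int, ms6_sterile_count: int) -> str:
    """Classify a plant from its number of sterile alleles (0, 1 or 2) at ms5 and ms6.

    Hypothetical helper: the allele-count encoding is an assumption for illustration.
    """
    for count in (ms5_sterile_count, ms6_sterile_count):
        if count not in (0, 1, 2):
            raise ValueError("allele count must be 0, 1 or 2")

    pair = (ms5_sterile_count, ms6_sterile_count)
    if pair == (2, 2):
        return "male sterile"          # ms5ms5ms6ms6
    if pair in ((1, 2), (2, 1)):
        return "maintainer"            # Ms5ms5ms6ms6 or ms5ms5Ms6ms6
    if pair == (1, 1):
        return "double heterozygous"   # Ms5ms5Ms6ms6
    return "fertile (other genotype)"  # not listed in Table 1


if __name__ == "__main__":
    for pair in [(2, 2), (1, 2), (2, 1), (1, 1), (0, 0)]:
        print(pair, "->", classify_gms_genotype(*pair))
```

In practice the allele counts would come from a genotyping assay for the ms5 and ms6 marker alleles; here they are entered by hand.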
As used herein, the term "locus" refers to a position on a chromosome, e.g. where a nucleotide, gene, sequence, or marker is located.
As used herein, the term "marker locus" refers to a specific chromosome location in the genome of a species where a specific marker can be found.
Closely linked loci such as a marker locus and a second locus can display an inter-locus recombination frequency of 10% or less, preferably about 9% or less, still more preferably about 8% or less, yet more preferably about 7% or less, still more preferably about 6% or less, yet more preferably about 5% or less, still more preferably about 4% or less, yet more preferably about 3% or less, and still more preferably about 2% or less.
Two loci that are localized to the same chromosome, and at such a distance that recombination between the two loci occurs at a frequency of less than 10% (e.g., about 9 %, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1 %, 0.75%, 0.5%, 0.25%, or less) are also said to be "proximal to" each other. In some cases, two different markers can have the same genetic map coordinates. In that case, the two markers are in such close proximity to each other that recombination occurs between them with such low frequency that it is undetectable.
As used herein, the term "molecular markers" or "genetic markers" refers to nucleic acids that are polymorphic in a population. The term includes nucleic acid sequences complementary to the genomic sequences, such as nucleic acids used as probes. Markers corresponding to genetic polymorphisms between members of a population can be detected by methods well established in the art. These include, e.g., PCR-based sequence specific amplification methods, detection of restriction fragment length polymorphisms (RFLP), detection of isozyme markers, detection of polynucleotide polymorphisms by allele specific hybridization (ASH), dynamic allele-specific hybridization (DASH), molecular beacons, microarray hybridization, oligonucleotide ligase assays, Flap endonucleases, 5' endonucleases, primer extension, single strand conformation polymorphism (SSCP) or temperature gradient gel electrophoresis (TGGE). DNA sequencing, such as the pyrosequencing technology, has the advantage of being able to detect a series of linked SNP alleles that constitute a haplotype.
Well established methods are also known for the detection of expressed sequence tags (ESTs) and SSR markers derived from EST sequences and randomly amplified polymorphic DNA (RAPD).
Techniques for DNA Isolation and Analysis:
DNA Isolation can be done by any of the very well known methods in literature.
Single nucleotide polymorphism (SNP) data can be obtained using any one of the known uniplex or multiplex SNP genotyping platforms that combine a variety of chemistries, detection methods, and reaction formats. Advances in high-throughput genotyping have made the generation of genome-scale data much easier and more cost-effective than before.
Next-generation sequencing (NGS) technologies can be used to detect large numbers of SNPs in breeding populations. A rise in the number of available SNP markers has led to increased demand for SNP genotyping capabilities, resulting in numerous cost-effective genotyping platforms available to researchers and breeders.
NGS provides much higher performance and throughput than the previously used Sanger sequencing technique. It provides inexpensive whole-genome sequence reads and supports applications such as chromatin immunoprecipitation, mutation mapping, polymorphism detection and detection of non-coding RNA sequences. Sequencing methods such as restriction site associated DNA sequencing (RADseq), multiplexed shotgun genotyping (MSG) and bulked segregant RNA-Seq (BSR-Seq) enable the identification of a significant number of markers and more accurate examination of many loci in a small number of samples.
Another genotyping-by-sequencing method in common use is DArTseq™. DArTseq™ combines DArT (Diversity Arrays Technology) complexity reduction methods with next-generation sequencing platforms (Ref 8: Sansaloni, C et al). The DArTseq procedure is used, among other purposes, to identify single nucleotide polymorphism (SNP) markers.
Another method for SNP genotyping is the TaqMan system (Applied Biosystems, Foster City, CA), which is based on fluorescently-tagged, allele-specific probes detected using real-time polymerase chain reaction (PCR)-based assays. Another preferred SNP genotyping technology is Kompetitive allele specific PCR (KASP), which uses endpoint fluorescence detection to discriminate tagged alleles. The TaqMan and KASP assays are widely used for genotyping due to their high throughput, low cost, sensitivity and tolerance of variation in the quality and quantity of input DNA.
Another method, rhAmp, is based on RNase H2-dependent PCR (rhPCR) and uses RNase H2 to activate primers after successful binding to their target sites, reducing primer dimer formation and improving the specificity of the reaction (Ref 2: Broccanello, C et al; Ref 1: Ayalew H et al).
Kompetitive Allele Specific PCR (KASP) is one of the uniplex SNP genotyping platforms, and is one of the most reliable technologies for SNP genotyping.
As used herein, the term, "marker allele", used interchangeably with the term "allele of a marker locus", can refer to one of a plurality of polymorphic nucleotide sequences found at a marker locus in a population.
As used herein, the term "Marker assisted selection" (or MAS) refers to a process by which individual plants are selected based on marker genotypes. The particular marker genotypes may be linked to specific desirable agronomic traits.
Marker-assisted selection (MAS):
Molecular markers can be used in a variety of plant breeding applications. A molecular marker that demonstrates linkage with a locus affecting a desired phenotypic trait provides a useful tool for the selection of the trait in a plant population. This is very useful where the phenotype is hard to assay, for example, disease resistance traits. Since DNA marker assays are less laborious and less time- and space-consuming than field phenotyping, much larger populations can be assayed, increasing the chances of finding a recombinant with the target segment from the donor line moved to the recipient line.
The closer the linkage, the more useful the marker, because recombination between the marker and the gene causing the trait, which can result in false positives, is less likely to occur. Having flanking markers decreases the chances that false positive selection will occur, as a double recombination event would be needed.
The ideal situation is to have a marker in the gene itself, so that recombination cannot occur between the marker and the gene. Such a marker is called a 'perfect marker'.
As used herein, the term "Marker assisted counter-selection" refers to a process by which marker genotypes are used to identify plants that will not be selected, allowing them to be removed from a breeding program or planting.
As used herein, the terms “foreground selection” and “forward selection” are used interchangeably, and refer to selecting plants having the marker/ favourable allele of the donor parent at the target locus.
As used herein, the term "haplotype" is the genotype of an individual at a plurality of genetic loci, i.e. a combination of alleles. Typically, the genetic loci described by a haplotype are physically and genetically linked, i.e., on the same chromosome segment. The term "haplotype" can refer to alleles at a particular locus, or to alleles at multiple loci along a chromosomal segment.
As used herein, the term "marker haplotype" refers to a combination of marker alleles at a marker locus. In general, the markers of a marker haplotype may lie on different chromosomes and together contribute to the same trait.
As used herein, the term "complement" refers to a nucleotide sequence that is complementary to a given nucleotide sequence.
As used herein, the term "contiguous DNA" refers to an uninterrupted stretch of genomic DNA represented by partially overlapping pieces or contigs.
As used herein, the term "heterogeneity" is used to indicate that individuals within the group differ in genotype at one or more specific loci.
As used herein, a centimorgan ("cM") is a unit of measure of recombination frequency. One cM is equal to a 1% chance that a marker at one genetic locus will be separated from a marker at a second locus due to crossing over in a single generation.
As used herein, the term "chromosomal interval" designates a contiguous linear span of genomic DNA on a single chromosome. The genetic elements or genes located on a single chromosomal interval are physically linked. The size of a chromosomal interval is not particularly defined or limited. In some aspects, the genetic elements located within a single chromosomal interval are genetically linked, typically with a genetic recombination distance of, for example, less than or equal to 20 cM, or alternatively, less than or equal to 10 cM. Thus, two genetic elements within a single chromosomal interval undergo recombination at a frequency of less than or equal to 20% or 10%.
As used herein, the term "closely linked", means that recombination between two linked loci occurs with a frequency of equal to or less than about 10% (i.e., are separated on a genetic map by not more than 10 cM). Thus, the closely linked loci co-segregate at least 90% of the time.
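The definitions of recombination frequency, centimorgan and "closely linked" given above can be illustrated with a short calculation. The Python sketch below is a minimal example under stated assumptions: the function names and the progeny counts are hypothetical, recombinant progeny are assumed to be directly countable, and the conversion of recombination frequency to map distance uses the simple 1% = 1 cM relationship stated above, which is only a reasonable approximation for short distances.

```python
def recombination_frequency(recombinant_progeny: int, total_progeny: int) -> float:
    """Fraction of progeny in which the two loci were separated by crossing over."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= recombinant_progeny <= total_progeny:
        raise ValueError("recombinant_progeny must lie between 0 and total_progeny")
    return recombinant_progeny / total_progeny


def approximate_map_distance_cm(recombinant_progeny: int, total_progeny: int) -> float:
    """Approximate genetic distance, taking 1% recombination as 1 cM."""
    return 100.0 * recombination_frequency(recombinant_progeny, total_progeny)


def is_closely_linked(recombinant_progeny: int, total_progeny: int,
                      threshold: float = 0.10) -> bool:
    """Apply the 10% recombination threshold used in the definition of 'closely linked'."""
    return recombination_frequency(recombinant_progeny, total_progeny) <= threshold


if __name__ == "__main__":
    # Hypothetical counts: 4 recombinants observed among 200 progeny.
    print(recombination_frequency(4, 200))
    print(approximate_map_distance_cm(4, 200))
    print(is_closely_linked(4, 200))
```

A stricter threshold, such as 0.05 or 0.02, can be passed as `threshold` to reflect the narrower ranges recited for closely linked loci.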
SNPs disclosed herein can be detected by any of the methods known in art, examples of which include, but are not limited to, DNA sequencing, PCR-based sequence specific amplification methods, detection of polynucleotide polymorphisms by allele specific hybridization (ASH), dynamic allele-specific hybridization (DASH), molecular beacons, microarray hybridization, oligonucleotide ligase assays, Flap endonucleases, 5' endonucleases, primer extension, single strand conformation polymorphism (SSCP) or temperature gradient gel electrophoresis (TGGE). DNA sequencing, such as the pyrosequencing technology has the advantage of being able to detect a series of linked SNP alleles that constitute a haplotype.
As used herein, the term "probe" refers to a nucleic acid sequence or molecule that can be used to identify the presence of a specific DNA or protein sequence; e.g., a nucleic acid probe that is complementary to a marker locus sequence, through nucleic acid hybridization.
As used herein, the term "Fragment" refers to a portion of a nucleotide sequence.
As used herein, the terms "phenotype", "phenotypic trait", and "trait" refer to the observable expression of a gene or series of genes. The phenotype can be observable to the naked eye, or by any other means of evaluation known in the art, e.g., weighing, counting, measuring (length, width, angles, etc.), microscopy, biochemical analysis, or an electromechanical assay.
In some cases, a phenotype is directly controlled by a single gene or genetic locus, i.e., a "single gene trait" or a "simply inherited trait". In the absence of large levels of environmental variation, single gene traits can segregate in a population to give a "qualitative" or "discrete" distribution, which means that the phenotype falls into discrete classes. In other cases, a phenotype is the result of several genes and can be considered a "multigenic trait" or a "complex trait".
As used herein, the term "crossed" or "cross" refers to a sexual cross and involves the fusion of two haploid gametes via pollination to produce diploid progeny (e.g., cells, seeds or plants). The term encompasses both the pollination of one plant by another and selfing (or self-pollination, e.g., when the pollen and ovule are from the same plant).
As used herein, the term "Backcrossing" refers to the process whereby hybrid progeny are repeatedly crossed back to one of the parents. In a backcrossing scheme, the "donor" parent refers to the parental plant with the desired gene/genes, locus/loci, or specific phenotype to be introgressed. The "recipient" parent (used one or more times) or "recurrent" parent (used two or more times) refers to the parental plant into which the gene or locus is being introgressed.
As used herein, the term "elite line" refers to any line that has resulted from breeding and selection for superior agronomic performance.
As used herein, the term "genetic map" refers to a representation of genetic linkage relationships among loci on one or more chromosomes (or linkage groups) within a given species, generally depicted in a diagrammatic or tabular form. For each genetic map, distances between loci are measured by how frequently their alleles appear together in a population (their recombination frequencies). Alleles can be detected using DNA or protein markers, or observable phenotypes. A genetic map is a product of the mapping population, types of markers used, and the polymorphic potential of each marker between different populations. Genetic distances between loci can differ from one genetic map to another. Information can be correlated from one genetic map to another using common markers. One of ordinary skill in the art can use common marker positions to identify positions of markers and other loci of interest on each individual genetic map. The order of loci should not change between maps, although frequently there may be small changes in marker orders due to reasons such as markers detecting alternate duplicate loci in different populations, differences in statistical approaches used to order the markers, novel mutation or laboratory error.
As used herein, the term "genetic map location" is a location on a genetic map relative to surrounding genetic markers on the same linkage group where a specified marker can be found within a given species.
As used herein, the term "Germplasm" refers to genetic material of or from an individual (e.g., a plant), a group of individuals (e.g., a plant line, variety or family), or a clone derived from a line, variety, species, or culture, or more generally, all individuals within a species or for several species (e.g., cotton germplasm collection or Andean germplasm collection). In general, germplasm provides genetic material with a specific molecular makeup that provides a physical foundation for some or all of the hereditary qualities of an organism or cell culture.
As used herein, the term "haploid" refers to a plant that has a single set (genome) of chromosomes.
As used herein, the term "hybrid" refers to the progeny obtained between the crossing of at least two genetically dissimilar parents.
As used herein, the term "inbred" refers to a line that has been bred for genetic homogeneity.
As used herein, the term "indel" refers to an insertion or deletion, wherein one line may be referred to as having an inserted nucleotide or piece of DNA relative to a second line or the second line may be referred to as having a deleted nucleotide or piece of DNA relative to the first line.
As used herein, the term "introgression" refers to the transmission of a desired allele of a genetic locus from one genetic background to another. For example, introgression of a desired allele at a specified locus can be transmitted to at least one progeny via a sexual cross between two parents of the same species, where at least one of the parents has the desired allele in its genome.
The process of "introgressing" is also referred to as "backcrossing" when the process is repeated two or more times.
As used herein, the term "linkage" is used to describe the degree with which one marker locus is associated with another marker locus or some other locus. The linkage relationship between a molecular marker and a locus affecting a phenotype is given as a "probability" or "adjusted probability".
As used herein, the term "linkage disequilibrium" (or LD) refers to a non-random segregation of genetic loci or traits (or both). In either case, linkage disequilibrium implies that the relevant loci are within sufficient physical proximity along a length of a chromosome so that they segregate together with greater than random (i.e., non-random) frequency. Markers that show linkage disequilibrium are considered linked. Linked loci co-segregate more than 50% of the time, e.g., from about 51% to about 100% of the time. In other words, two markers that co-segregate have a recombination frequency of less than 50% (and by definition, are separated by less than 50 cM on the same linkage group). As used herein, linkage can be between two markers, or alternatively between a marker and a locus affecting a phenotype.
A marker locus can be "associated with" (linked to) a trait. The degree of linkage of a marker locus and a locus affecting a phenotypic trait is measured, e.g., as a statistical probability of co-segregation of that molecular marker with the phenotype (e.g., an F statistic or LOD score).
As used herein, "linkage equilibrium" describes a situation where two markers independently segregate, i.e., sort among progeny randomly. Markers that show linkage equilibrium are considered unlinked (whether or not they lie on the same chromosome).
The "logarithm of odds (LOD) value" or "LOD score" (Ref 7 : Risch, et al) is used in genetic interval mapping to describe the degree of linkage between two marker loci. A LOD score of three between two markers indicates that linkage is 1000 times more likely than no linkage, while a LOD score of two indicates that linkage is 100 times more likely than no linkage. LOD scores greater than or equal to two may be used to detect linkage. LOD scores can also be used to show the strength of association between marker loci and quantitative traits in "quantitative trait loci" mapping. In this case, the LOD score's size is dependent on the closeness of the marker locus to the locus affecting the quantitative trait, as well as the size of the quantitative trait effect.
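Because a LOD score is the base-10 logarithm of the odds of linkage versus no linkage, the odds quoted above follow directly from the score. The Python sketch below illustrates this together with a simple two-point LOD calculation. It is a minimal example under stated assumptions: the function names and counts are hypothetical, and the two-point calculation assumes a fully informative cross in which recombinant and non-recombinant progeny can be counted directly, with the recombination fraction estimated as the observed proportion of recombinants (capped at 0.5).

```python
import math


def odds_from_lod(lod: float) -> float:
    """Odds of linkage versus no linkage implied by a LOD score."""
    return 10.0 ** lod


def two_point_lod(recombinants: int, total: int) -> float:
    """Two-point LOD score at the maximum-likelihood recombination fraction.

    Compares the likelihood of the data at the estimated recombination
    fraction with the likelihood under no linkage (recombination fraction 0.5).
    """
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("counts must satisfy 0 <= recombinants <= total, total > 0")

    theta = min(recombinants / total, 0.5)  # estimates above 0.5 mean no linkage
    non_recombinants = total - recombinants

    log_likelihood_linked = 0.0
    if recombinants:
        log_likelihood_linked += recombinants * math.log10(theta)
    if non_recombinants:
        log_likelihood_linked += non_recombinants * math.log10(1.0 - theta)

    log_likelihood_unlinked = total * math.log10(0.5)
    return log_likelihood_linked - log_likelihood_unlinked


if __name__ == "__main__":
    print(odds_from_lod(3))        # odds implied by a LOD score of three
    print(odds_from_lod(2))        # odds implied by a LOD score of two
    print(two_point_lod(0, 10))    # hypothetical: no recombinants in 10 progeny
    print(two_point_lod(5, 10))    # hypothetical: half of the progeny recombinant
```

With no recombinants among ten informative progeny the score just exceeds three, whereas equal numbers of recombinant and non-recombinant progeny give a score of zero, i.e. no evidence of linkage.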
As used herein, the term "probability value" or "p-value" is the statistical likelihood that the particular combination of a phenotype and the presence or absence of a particular marker allele is random. Thus, the lower the probability score, the greater the likelihood that a locus and a phenotype are associated. The probability score can be affected by the proximity of the first locus (usually a marker locus) and the locus affecting the phenotype, plus the magnitude of the phenotypic effect (the change in phenotype caused by an allele substitution). In some aspects, the probability score is considered "significant" or "nonsignificant". In some embodiments, a probability score of 0.05 (p=0.05, or a 5% probability) of random assortment is considered a significant indication of association. However, an acceptable probability can be any probability of less than 50% (p=0.5). For example, a significant probability can be less than 0.25, less than 0.20, less than 0.15, less than 0.1, less than 0.05, less than 0.01, or less than 0.001.
As used herein, the term, "production marker" or "production SNP marker" refers to a marker that has been developed for high-throughput purposes. Production SNP markers are developed to detect specific polymorphisms and are designed for use with a variety of chemistries and platforms.
As used herein, the term, "quantitative trait locus" or "QTL" refers to a region of DNA that is associated with the differential expression of a quantitative phenotypic trait in at least one genetic background, e.g., in at least one breeding population. The region of the QTL encompasses or is closely linked to the gene or genes that affect the trait in question.
An "allele of a QTL" (or "QTL allele") can comprise multiple genes or other genetic factors within a contiguous genomic region or linkage group. An allele of a QTL can be defined by a haplotype within a specified window wherein said window is a contiguous genomic region that can be defined, and tracked, with a set of one or more polymorphic markers. The haplotype is then defined by the unique fingerprint of alleles at each marker within the specified window.
As used herein, the term "reference sequence" or "consensus sequence" refers to a defined sequence used as a basis for sequence comparison. The reference sequences for the SNP markers disclosed herein are sequences obtained from the Gossypium hirsutum UTX_v2.1 reference genome (Ref 4: Chen et al.).
As used herein, the terms "agronomic traits" and "plant trait or characteristic" are used interchangeably and refer to the traits and associated genotype that ultimately lead to higher yield but encompass any plant characteristic that can lead to higher plant health and yield, such as herbicide resistance, emergence vigour, vegetative vigour, stress tolerance, disease resistance or tolerance, branching, flowering, seed set, seed size, seed density, standability, threshability, male sterility and the like.
The current invention encompasses genetic male sterility as the desirable trait for the cotton plants being selected by the method disclosed herein. More specifically, the molecular markers disclosed herein are closely associated with, and help select cotton plants which have the double recessive genotype ms5ms5ms6ms6; and which phenotypically display male sterility.
Embodiments:
The present invention relates to identifying and selecting cotton plants that are male sterile, and the male sterility is conferred by the presence of ms5ms6 double recessive genes. The current invention encompasses molecular markers for identifying and selecting genetically male sterile cotton plants, and all genetic configurations expected in any sterile x fertile segregating population. The invention further relates to methods for generating such cotton plants, as well as the cotton plants generated by the methods disclosed herein. The invention further relates to the use of such male sterile cotton plants in making cotton hybrids by cross-pollinating with other inbred plant varieties. The invention also relates to cotton plants or plant parts thereof, obtained by or obtainable by the method as described herein, as well as cotton plants or plant parts thereof comprising the marker alleles described herein.
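The genetic configurations expected in a sterile x fertile segregating population can be enumerated with a small Punnett-square calculation. The Python sketch below is illustrative only. It assumes that ms5 and ms6 assort independently (unlinked loci), which is an assumption of this example; the function names and the encoding of each locus as a count of sterile alleles (0, 1 or 2) are likewise hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def classify(ms5_count: int, ms6_count: int) -> str:
    """Genotype class from the number of sterile alleles at ms5 and ms6."""
    pair = (ms5_count, ms6_count)
    if pair == (2, 2):
        return "male sterile"
    if pair in ((1, 2), (2, 1)):
        return "maintainer"
    if pair == (1, 1):
        return "double heterozygous"
    return "other fertile"


def gametes(genotype):
    """Four equally likely gametes, assuming independent assortment (an assumption).

    Each gamete is (ms5 allele, ms6 allele), where 1 is a sterile allele and 0 a fertile one.
    """
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    return list(product(alleles[genotype[0]], alleles[genotype[1]]))


def cross(parent_a, parent_b):
    """Expected proportion of each genotype class among the progeny of a cross."""
    counts = Counter()
    for gamete_a, gamete_b in product(gametes(parent_a), gametes(parent_b)):
        progeny = (gamete_a[0] + gamete_b[0], gamete_a[1] + gamete_b[1])
        counts[classify(*progeny)] += 1
    total = sum(counts.values())
    return {name: Fraction(count, total) for name, count in counts.items()}


if __name__ == "__main__":
    double_het = (1, 1)   # Ms5ms5Ms6ms6, e.g. the F1 of a sterile x fully fertile cross
    maintainer = (1, 2)   # Ms5ms5ms6ms6
    sterile = (2, 2)      # ms5ms5ms6ms6

    print("F2 from selfed double heterozygote:", cross(double_het, double_het))
    print("Sterile x maintainer:", cross(sterile, maintainer))
```

Under these assumptions, selfing a double heterozygous plant yields male sterile plants in 1 of 16 progeny, while crossing a sterile plant with a maintainer yields sterile and maintainer progeny in equal proportions.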
The current invention discloses novel SNP markers tightly linked to the genetic male sterility (GMS) trait in cotton. These markers can be used for marker-based selection of fertile or sterile phenotypes before flowering or at any stage from seed to maturity, to convert non-GMS lines into GMS lines through marker-assisted selection, to develop new GMS inbred lines through breeding, and to aid quality control and validation of the GMS inbred parents of GMS-based hybrids during seed production.
The SNP markers disclosed herein can be used for:
• GMS trait phenotype prediction before flowering or at any stage for hybrid seed production
• Accelerated GMS line conversion through markers
• Development of new GMS inbreds through breeding
• Accurate quality control or validation of commercial hybrid GMS parental lines for genetic purity
• Improving the genetic purity of GMS hybrids by GMS markers based QC and elimination of all fertile plants before seed production
• Elimination of selfing in conventional backcross trait introgression breeding through marker-based foreground selection
One embodiment of the current invention is a SNP marker for identifying and selecting a genetically male sterile cotton plant or germplasm, wherein the SNP marker is an ms5 or ms6 sterile SNP marker allele at a SNP marker locus, and wherein the ms5 sterile SNP marker allele is SNP1 (TRLCTMS-582), comprising a substitution of "T" in place of "C" at position 63 of SEQ ID NO: 1; and wherein the ms6 sterile marker allele is selected from the group consisting of: SNP2 (TRLCTMS-2), comprising a substitution of "C" in place of "T" at position 101 of SEQ ID NO: 3, and SNP3 (TRLCTMS-3), comprising a substitution of "C" in place of "T" at position 101 of SEQ ID NO: 5; and wherein the ms5 sterile marker allele and at least one of the ms6 sterile marker alleles are present in double homozygous state in the genetically male sterile cotton plant.
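The SNP positions and substitutions recited above can be represented as a small lookup table and used to call the allele carried by a sequence. The Python sketch below is a hedged illustration only: the names `SNP_DEFINITIONS` and `call_allele` are hypothetical, the input sequence is assumed to be in the same orientation and coordinate system as the corresponding reference SEQ ID NO, and the sequences used in the example are synthetic placeholders rather than the actual SEQ ID NO sequences.

```python
# 1-based positions and bases as recited in the text for SNP1, SNP2 and SNP3.
SNP_DEFINITIONS = {
    "SNP1": {"locus": "ms5", "position": 63, "fertile": "C", "sterile": "T"},
    "SNP2": {"locus": "ms6", "position": 101, "fertile": "T", "sterile": "C"},
    "SNP3": {"locus": "ms6", "position": 101, "fertile": "T", "sterile": "C"},
}


def call_allele(snp_name: str, sequence: str) -> str:
    """Return 'sterile', 'fertile' or 'unknown' for the base at the SNP position.

    Assumes `sequence` is aligned to the same coordinates as the reference
    sequence for this SNP (an assumption of this sketch).
    """
    snp = SNP_DEFINITIONS[snp_name]
    index = snp["position"] - 1  # convert the 1-based position to a 0-based index
    if index >= len(sequence):
        raise ValueError("sequence is shorter than the SNP position")

    base = sequence[index].upper()
    if base == snp["sterile"]:
        return "sterile"
    if base == snp["fertile"]:
        return "fertile"
    return "unknown"


if __name__ == "__main__":
    # Synthetic placeholder sequences, NOT the real SEQ ID NO sequences.
    placeholder_sterile = "A" * 62 + "T" + "A" * 60
    placeholder_fertile = "A" * 62 + "C" + "A" * 60
    print(call_allele("SNP1", placeholder_sterile))
    print(call_allele("SNP1", placeholder_fertile))
```

A diploid genotype call would combine the alleles observed on both chromosome copies; this sketch inspects a single sequence at a time.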
One embodiment of the current invention is a method of identifying a genetically male sterile cotton plant or germplasm, the method comprising the step of: detecting in a cotton plant or germplasm the presence of at least one ms5 sterile SNP marker allele and at least one ms6 sterile marker allele as disclosed herein, wherein both the at least one ms5 and the at least one ms6 sterile alleles are present in homozygous state and wherein the double ms5ms6 homozygous state is linked to the male sterile phenotype of the cotton plant or germplasm.
In one embodiment, the method further comprises identifying a fertile maintainer or double heterozygous fertile cotton plant or germplasm, the method comprising the steps of: a) detecting in a cotton plant or germplasm the presence of the at least one ms5 sterile SNP marker allele in heterozygous state and the at least one ms6 sterile marker allele in heterozygous state to identify the double heterozygous plant or germplasm for the at least one ms5 and the at least one ms6 sterile alleles, wherein the double heterozygous plant or germplasm exhibits fertile phenotype; or b) detecting in the cotton plant or germplasm the presence of the at least one ms5 sterile SNP marker allele in heterozygous state or the at least one ms6 sterile marker allele in heterozygous state to identify the single heterozygous plant or germplasm for the at least one ms5 or the at least one ms6 sterile marker allele, wherein the single heterozygous plant or germplasm exhibits fertile phenotype and has the maintainer plant genotype.
In one embodiment, the method further comprises the steps of: a) obtaining DNA from the cotton plant or germplasm; and b) analysing the DNA from step (a) for the presence of ms5 and ms6 sterile SNP marker alleles.
In one embodiment, the single heterozygous plant or germplasm identified by the method disclosed above is identified as maintainer plant for maintaining genetically male sterile cotton plant lines.
In one embodiment, the current invention encompasses a method of identifying a male sterile cotton plant or germplasm that displays genetic male sterility, the method comprising the step of: detecting in the germplasm of the cotton plant at least one ms6 male sterile marker, wherein the at least one ms6 marker is located within a chromosomal interval comprising and flanked by SEQ ID NO:7 and SEQ ID NO:8. In one embodiment, the above method further comprises the steps of: a) isolating DNA from the cotton plant or germplasm; and b) analyzing the isolated DNA for the presence of at least one ms6 sterile marker allele in the chromosomal interval comprising and flanked by SEQ ID NO:7 and SEQ ID NO:8. In one embodiment, the one or more ms6 sterile marker alleles identified in this interval are selected from the group consisting of SNP2 and SNP3.
In one embodiment, the current invention encompasses a method of identifying a male sterile, maintainer fertile, or double heterozygous fertile cotton plant or germplasm from a cotton plant or germplasm population produced by crossing a male sterile cotton plant or germplasm with a fertile cotton plant or germplasm, the method comprising the steps of:
a) Crossing a genetically male sterile plant or germplasm which is homozygous for at least one ms5 sterile marker allele SNP1 and at least one ms6 male sterile allele selected from the group consisting of SNP2 and SNP3, as disclosed herein, with a second recurrent fertile parent cotton plant or germplasm to obtain a F1 plant and a segregating progeny F2 plant or germplasm population by selfing of F1 plant;
b) Selecting a F2 progeny plant from the segregating progeny F2 plant population from step (a) with:
(i) both the at least one ms5 sterile allele and at least one ms6 sterile allele in homozygous state, wherein the F2 progeny plant or germplasm exhibits male sterile phenotype; or
(ii) both the at least one ms5 sterile marker allele and the at least one ms6 sterile marker allele in heterozygous state, wherein the double heterozygous F2 plant or germplasm exhibits fertile phenotype; or
(iii) either the at least one ms5 sterile marker allele or the at least one ms6 sterile marker allele in heterozygous state, wherein the single heterozygous F2 plant or germplasm exhibits fertile phenotype and is a maintainer plant or germplasm; followed by the step of
c) Selecting F3 progeny plants or germplasm from the segregating progeny F3 plant population or germplasm from step (b) with:
(i) both the at least one ms5 sterile allele and at least one ms6 sterile allele in homozygous state, wherein the F3 progeny plant or germplasm exhibits male sterile phenotype; or
(ii) both the at least one ms5 sterile marker allele and the at least one ms6 sterile marker allele in heterozygous state, wherein the double heterozygous F3 plant or germplasm exhibits fertile phenotype; or
(iii) either the at least one ms5 sterile marker allele or the at least one ms6 sterile marker allele in heterozygous state, wherein the single heterozygous F3 plant or germplasm exhibits fertile phenotype and is a maintainer plant or germplasm;
d) repeating selection steps b and c “n” number of times whenever maintainer plants or double heterozygous plants or germplasm are selected, wherein n is 2 to 8 more filial generations, and wherein the single heterozygous alleles and the double heterozygous plants or germplasm are selected; and
e) selecting a sib-mating progeny plant or germplasm population derived from sib-crossing a near-isogenic fertile maintainer plant or germplasm which is heterozygous either at the at least one ms5 or the at least one ms6 sterile marker allele with a sterile plant or germplasm homozygous at both the at least one ms5 and the at least one ms6 marker alleles.
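The F2 selection classes named in step (b) can be illustrated with a minimal sketch of the expected class frequencies. It assumes, beyond what the text states, that the fertile parent is homozygous for the fertile allele at both loci (so the F1 is double heterozygous), that the two loci assort independently (they lie on chromosomes A12 and D12), and that sterility requires the double homozygous sterile genotype. The function name `f2_classes` and the class labels are hypothetical. The "single_het" class counts every plant with exactly one heterozygous locus, whatever the state of the other locus.

```python
from fractions import Fraction
from itertools import product

# Per-locus genotype frequencies after selfing a heterozygote (Mendelian 1:2:1).
LOCUS_SELF = {
    "hom_fertile": Fraction(1, 4),
    "het": Fraction(1, 2),
    "hom_sterile": Fraction(1, 4),
}


def f2_classes():
    """Expected F2 class frequencies from a selfed ms5/ms6 double heterozygote."""
    classes = {
        "sterile": Fraction(0),        # homozygous sterile at both loci
        "double_het": Fraction(0),     # heterozygous at both loci
        "single_het": Fraction(0),     # heterozygous at exactly one locus
        "other_fertile": Fraction(0),  # no heterozygous locus, not double sterile
    }
    for (g5, p5), (g6, p6) in product(LOCUS_SELF.items(), repeat=2):
        p = p5 * p6  # assumption: independent assortment of the two loci
        n_het = (g5 == "het") + (g6 == "het")
        if g5 == "hom_sterile" and g6 == "hom_sterile":
            classes["sterile"] += p
        elif n_het == 2:
            classes["double_het"] += p
        elif n_het == 1:
            classes["single_het"] += p
        else:
            classes["other_fertile"] += p
    return classes


if __name__ == "__main__":
    for name, freq in f2_classes().items():
        print(name, freq)
```

Under these assumptions only 1 in 16 F2 plants is expected to be male sterile, which is one reason a seedling-stage marker call is useful for picking the rare class.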
In one embodiment, the second parent plant or germplasm in the method described above is a recurrent parent containing unfavourable alleles at ms5 and ms6, and the method further comprises the steps of:
a) Backcrossing the F1 plant or germplasm obtained in step (a) of method described above, with the recurrent parent cotton plant to get BC1F1 progeny plant population or germplasm; and
b) Selecting a progeny plant or germplasm from the segregating BC1F1 progeny plant or germplasm population with both the at least one ms5 and the at least one ms6 sterile marker alleles in heterozygous state, and backcrossing the selected progeny plant or germplasm with the recurrent parent plant or germplasm to produce BC2F1, wherein the double heterozygous plant or germplasm is fertile;
c) Repeating steps (a)-(b) “n” number of times, wherein “n” is 2 to 5 or more, to obtain BCnF1 progeny plant, followed by selfing of BCnF1 plants to get BCnF2 segregating progeny plant or germplasm; and
d) Selecting BCnF2 near-isogenic recurrent plant type progeny plant or germplasm with maintainer plant that is heterozygous for either the at least one ms5 or the at least one ms6 sterile marker allele, and sib-mating a near isogenic maintainer fertile plant with a sterile plant within family to produce a male sterile plant or germplasm with background genotype of the recurrent parent plant or germplasm.
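The backcross scheme above can be put in rough numbers with a short sketch. It rests on assumptions not stated in the text: the recurrent parent is taken to be homozygous for the fertile allele at both loci, the loci assort independently, and no marker-based background selection is applied. The function names are hypothetical, and the genome-recovery formula is the standard textbook expectation rather than a result from this work.

```python
from fractions import Fraction


def double_het_fraction():
    """Expected share of double heterozygotes among progeny of one backcross.

    A double heterozygous plant passes on the sterile allele at each locus
    with probability 1/2; the recurrent parent (assumed homozygous fertile)
    contributes only fertile alleles.
    """
    return Fraction(1, 2) * Fraction(1, 2)


def recurrent_genome_fraction(n_backcrosses):
    """Expected recurrent-parent genome share in BCn: 1 - (1/2)^(n + 1)."""
    return 1 - Fraction(1, 2) ** (n_backcrosses + 1)


if __name__ == "__main__":
    print("double heterozygotes per backcross:", double_het_fraction())
    for n in range(1, 6):
        print(f"BC{n}: {float(recurrent_genome_fraction(n)):.4f}")
```

On these assumptions about one backcross plant in four carries both sterile alleles, so marker screening of each BCnF1 population identifies the plants worth advancing.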
Table 2: Sequence ID numbers corresponding to the different SNP marker loci for the ms5 and ms6 sterile marker alleles disclosed herein
| SNP marker | Chromosome | Allele | Genotype | SEQ ID NO |
|---|---|---|---|---|
| SNP1; TRLCTMS-582 (ms5) | A12 | fertile | CC | 1 |
| SNP1; TRLCTMS-582 (ms5) | A12 | sterile | TT | 2 |
| SNP2; TRLCTMS-2 (ms6) | D12 | fertile | TT | 3 |
| SNP2; TRLCTMS-2 (ms6) | D12 | sterile | CC | 4 |
| SNP3; TRLCTMS-3 (ms6) | D12 | fertile | TT | 5 |
| SNP3; TRLCTMS-3 (ms6) | D12 | sterile | CC | 6 |
| Left flanking sequence on chromosome D12 (ms6 loci) | D12 | | | 7 |
| Right flanking sequence on chromosome D12 (ms6 loci) | D12 | | | 8 |
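The allele definitions in Table 2 can be turned into a simple genotype-to-class lookup. The following is a minimal sketch, not the assay or scoring software used in this work. It assumes one SNP call per locus given as a two-letter string such as "CT"; the function names `zygosity` and `classify` and the class labels are hypothetical. The label "single heterozygous" is applied to any plant with exactly one heterozygous locus, without regard to the state of the other locus.

```python
# Sterile and fertile alleles per locus, as listed in Table 2.
STERILE_ALLELE = {"ms5": "T", "ms6": "C"}
FERTILE_ALLELE = {"ms5": "C", "ms6": "T"}


def zygosity(locus, call):
    """Return 'hom_sterile', 'het' or 'hom_fertile' for a call such as 'CT'."""
    call = call.upper()
    alleles = set(call)
    allowed = {STERILE_ALLELE[locus], FERTILE_ALLELE[locus]}
    if len(call) != 2 or not alleles <= allowed:
        raise ValueError(f"unexpected {locus} call: {call!r}")
    if alleles == {STERILE_ALLELE[locus]}:
        return "hom_sterile"
    if alleles == {FERTILE_ALLELE[locus]}:
        return "hom_fertile"
    return "het"


def classify(ms5_call, ms6_call):
    """Map a pair of SNP calls to the plant classes named in the text."""
    z5 = zygosity("ms5", ms5_call)
    z6 = zygosity("ms6", ms6_call)
    if z5 == "hom_sterile" and z6 == "hom_sterile":
        return "male sterile"
    if z5 == "het" and z6 == "het":
        return "fertile, double heterozygous"
    if (z5 == "het") != (z6 == "het"):
        return "fertile, single heterozygous"
    return "fertile, no heterozygous locus"


if __name__ == "__main__":
    for calls in [("TT", "CC"), ("CT", "TC"), ("TT", "CT"), ("CC", "TT")]:
        print(calls, "->", classify(*calls))
```

Invalid calls raise an error instead of being silently scored, so a failed assay is not mistaken for a fertile plant.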
Examples
Example 1:
Ms5 SNP markers development and validation
A GMS sib-population (R23G) was used to identify SNPs within an ms5 physical region/interval co-segregating with the fertility and sterility phenotypes. A 4.2 kb fragment on chromosome A12 near the ms5 locus was amplified in individual fertile and sterile plants using overlapping PCR primers. The PCR amplicon from each individual fertile and sterile plant was separately purified from the agarose gel and sequenced by the Sanger method. Sanger sequencing revealed 21 SNPs in the 4.2 kb amplicon. Ten SNPs were shortlisted and further tested in the individual fertile and sterile plants of the R23G population that had been used for amplicon sequencing.
Based on the marker-trait association results of the above experiment, we shortlisted two SNPs for a second round of validation using a large cotton germplasm set comprising 360 lines representing six diverse GMS genetic backgrounds and elite inbred lines. The overall accuracy in predicting the ms5 phenotype (fertile/sterile) in GMS backgrounds was 99.4%. The ms5 SNPs showed very tight linkage, which is desirable for breeding selections across genetic backgrounds. The ms5 SNPs also revealed very high marker polymorphism between GMS lines and normal inbred lines, which is ideal for GMS line development using marker-assisted selection. All plants heterozygous for the ms5 sterile allele were found to be fertile, and plants homozygous for it were found to be sterile.
These ms5 SNPs were also further validated in combination with ms6 SNPs using more than 1500 lines consisting of both GMS-derived genetic populations at various stages and inbred lines. Marker-assisted selection (MAS) for ms5/MS5 SNPs showed very high phenotype prediction accuracy and consistency.
Example 2:
ms6 SNP markers identification and validation using high density GBS sequencing
A set of 192 cotton inbred lines representing wild type elite fertile inbreds and 51 GMS lines from sib-populations were genotyped with high-density GBS sequencing. A total of 68,973 high quality SNP markers derived from the GBS data were used to identify marker-trait associations by GWAS-MLM (mixed linear model) analysis. We identified two SNP markers on chromosome D12 tightly co-segregating with the ms6 gene, with more than 99% phenotype prediction accuracy. Based on marker profile and subsequent field phenotypic observations, we found that a few inbred lines in the germplasm are already fixed for the ms6 sterile gene. Two representative inbreds fixed for the ms6 favourable allele were further validated by phenotyping additional populations (developed by crossing the ms6 favourable allele fixed inbred lines with GMS sterile lines).
Example 3:
Confirmation of ms6-SNP marker accuracy for ms6 segregating breeding and sib populations
We used two F2 populations derived from crossing two fertile inbreds homozygous for the ms6 favourable SNP alleles with GMS-sterile plants from the R41G genetic background. The F2 is expected to segregate only for the ms5 locus in a 3:1 ratio, since the ms6 favourable alleles are present in homozygous state in both parental lines. GMS flowering data (fertile vs sterile) recorded in these two F2 populations showed 3:1 phenotypic segregation for fertile and sterile phenotypes, thus confirming the presence of only the fertile MS5 gene in the non-GMS parents. A representative set of 90 confirmed fertile lines from the same population was analysed separately with ms5 SNP markers. The marker-trait association was more than 99%, and the marker was able to identify the three classes of segregants expected for an F2 population segregating for a single gene (ms5). Inbred lines of this type, in which the ms6 favourable alleles are already fixed, can easily be used to develop new GMS lines by selecting with ms5 SNP markers only, which reduces the resources needed since only ms5 selection is required in MAS. A summary of the genetic analysis experiment is provided in Table 3.
Table 3: Genetic analysis of segregating F2 population for ms5
| S.No. | F2 population | Total plants | Fertile plants | Sterile plants |
|---|---|---|---|---|
| 1 | RGF2-1 | 338 | 259 | 79 |
| 2 | RGF2-2 | 409 | 312 | 97 |
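The 3:1 fit of the Table 3 counts can be checked with a chi-square goodness-of-fit test. The text does not say which test, if any, was applied, so the following is only an illustrative sketch using the counts from Table 3; the function name `chi_square_3_to_1` is hypothetical, and 3.841 is the standard critical value for one degree of freedom at the 0.05 level.

```python
def chi_square_3_to_1(fertile, sterile):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = fertile + sterile
    expected_fertile = total * 3 / 4
    expected_sterile = total / 4
    return ((fertile - expected_fertile) ** 2 / expected_fertile
            + (sterile - expected_sterile) ** 2 / expected_sterile)


# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_0_05_DF1 = 3.841

if __name__ == "__main__":
    # Counts taken from Table 3.
    for name, fertile, sterile in [("RGF2-1", 259, 79), ("RGF2-2", 312, 97)]:
        stat = chi_square_3_to_1(fertile, sterile)
        verdict = "consistent with 3:1" if stat < CRITICAL_0_05_DF1 else "deviates from 3:1"
        print(f"{name}: chi-square = {stat:.3f} ({verdict})")
```

Both statistics (about 0.48 and 0.36) fall well below the critical value, in line with the single-gene segregation described for these populations.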
Example 4
Markers prediction accuracy testing in breeding program
In order to validate the SNP markers and determine their prediction accuracy, we used 67 diverse populations from breeding cycles, consisting of multiple GMS sib-populations, GMS-derived fertile x sterile breeding populations at various filial stages (F2, F3, F4, F5, etc.) selected for either ms5 or ms6 maintenance, and backcross populations at different stages from the GMS trait introgression program. SNP marker alleles at the ms5 and ms6 loci were compared against observed field phenotypes, and overall prediction accuracy was determined from the deviations between the field phenotype and the marker-predicted phenotype. We found prediction accuracy of >99% across genetic backgrounds with a sample of 1474 total plants. The breeding generations and total number of plants tested for marker prediction accuracy are summarized in Table 4 and Table 5.
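Table 4: Breeding generations and number of plants tested for marker prediction accuracy
| Breeding cycle generation | Total number of plants tested |
|---|---|
| F2 | 365 |
| F3 | 54 |
| F4 | 222 |
| F5 | 94 |
| BC2F1 | 30 |
| BC2F2 | 38 |
| BC2F5 | 144 |
| BC3F1 | 14 |
| BC3F2 | 26 |
| BC3F3 | 15 |
| BC4F1 | 15 |
| BC5F1 | 39 |
| BC5F2 | 21 |
| Diverse GMS-inbreds 16 | 397 |
| TOTAL | 1474 |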
Table 5: Data for sterile and fertile plants from a population of plants, with detection of ms5 and ms6 sterile marker alleles
| Plant class | Number of plants | ms5 configuration | ms6 configuration | Prediction accuracy |
|---|---|---|---|---|
| Fertile plants | 1187 | CC or heterozygous state | TT or heterozygous state | >99% |
| Sterile plants | 317 | TT | CC | >99% |
| Total plants (fertile plus sterile) | 1504 | | | |
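The comparison of marker-predicted and field-observed phenotypes described in Example 4 can be sketched as a simple agreement rate. This is an illustrative assumption about how such an accuracy could be computed, not the procedure used in the study. The function names are hypothetical, genotypes are encoded as the number of sterile allele copies (0, 1 or 2) at each locus, and the records in `demo` are invented for illustration only.

```python
def predicted_phenotype(ms5_sterile_copies, ms6_sterile_copies):
    """Predict 'sterile' only when both loci carry two sterile allele copies."""
    if ms5_sterile_copies == 2 and ms6_sterile_copies == 2:
        return "sterile"
    return "fertile"


def prediction_accuracy(records):
    """Share of plants whose marker prediction matches the field phenotype.

    records: iterable of (ms5_copies, ms6_copies, observed_phenotype).
    """
    records = list(records)
    matches = sum(predicted_phenotype(a, b) == observed
                  for a, b, observed in records)
    return matches / len(records)


# Hypothetical records for illustration only; not data from the study.
demo = [
    (2, 2, "sterile"),
    (1, 2, "fertile"),
    (1, 1, "fertile"),
    (0, 2, "fertile"),
    (2, 2, "fertile"),  # a mismatch between marker call and field score
]

if __name__ == "__main__":
    print(f"accuracy = {prediction_accuracy(demo):.2f}")
```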
Example 5
Marker prediction accuracy in breeding material segregating for GMS
We also validated the ability of the markers to discriminate all Mendelian classes expected in F2 or advanced segregating filial generations or backcross segregating populations. The SNP markers were able to clearly discriminate all classes in any segregating generation, with clear association with field phenotypes. The markers were able to clearly identify sterile plants at the seedling stage with >99% prediction accuracy across genetic backgrounds and seasons. A summary of the results is provided in Table 4.
Example 6
Marker capability to discriminate off-types in GMS breeding material
We tested the accuracy of the markers in identifying off-types (unexpected allele configurations in a GMS breeding program, which are not desirable), that is, plants unexpectedly homozygous or heterozygous for favourable or unfavourable alleles in GMS-derived genetic populations, GMS inbreds and backcross trait introgression populations. We found >99% accuracy for off-type detection, confirmed by field phenotype. Off-type detection accuracies are provided in Table 6.
Table 6: Prediction accuracy of marker-based identification of off-type genetic configurations
| Population type | No. of off-types detected by markers | Off-type field phenotype | Prediction accuracy |
|---|---|---|---|
| GMS inbred sib populations | 26 | Fertile | >99% |
| Breeding crosses-F2>F5 cycles | 21 | Fertile | >99% |
| GMS BC populations | 17 | Fertile/loss of ms5ms6 genes | >99% |
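Off-type detection as described in Example 6 amounts to flagging plants whose marker configuration is not among those expected for the population. The sketch below shows one way this could be expressed; it is not the procedure used in the study. The function name `find_off_types`, the zygosity labels, the plant IDs and the example population (a maintainer assumed heterozygous at ms5 and homozygous sterile at ms6, sib-mated to a sterile plant) are all hypothetical.

```python
def find_off_types(plants, expected_classes):
    """Return sorted IDs of plants whose marker class is not expected.

    plants: mapping of plant ID -> (ms5_zygosity, ms6_zygosity)
    expected_classes: set of allowed (ms5_zygosity, ms6_zygosity) tuples
    """
    return sorted(pid for pid, genotype in plants.items()
                  if genotype not in expected_classes)


# Assumed sib population: only maintainer-type and sterile progeny are expected.
expected = {
    ("het", "hom_sterile"),          # maintainer-type fertile sib
    ("hom_sterile", "hom_sterile"),  # male sterile sib
}

# Hypothetical marker calls for illustration only.
plants = {
    "P1": ("het", "hom_sterile"),
    "P2": ("hom_sterile", "hom_sterile"),
    "P3": ("hom_fertile", "hom_sterile"),  # unexpected: no ms5 sterile allele
    "P4": ("het", "het"),                  # unexpected: ms6 not fixed
}

if __name__ == "__main__":
    print(find_off_types(plants, expected))
```

Keeping the expected set explicit per population type lets the same check serve sib populations, filial generations and backcross populations by swapping in a different set.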
Example 7
Markers deployment for GMS in cotton breeding
Deployment of GMS markers in breeding programs requires extensive validation across the breeding and testing pipeline, since the steps from GMS line development to hybrid production are interlinked. GMS breeding generally consists of line development through pedigree breeding, GMS-based hybrid seed production, and GMS line conversion through MAS/backcrossing, and marker prediction accuracy is critical for successful deployment of GMS markers in each of these. Since the objectives of a GMS breeding flow vary with the final product expected, the SNP markers must be able to discriminate both genotypic and phenotypic configurations. For example, the genetic constitution of fertile and sterile plants depends on the combination of the type of GMS line used and the corresponding wild type fertile parent.
We have extensively validated GMS markers across breeding cycles with > 99% phenotype prediction accuracy of sterile and fertile plants over seasons.
REFERENCES:
1. Ayalew H, Tsang PW, Chu C, Wang J, Liu S, Chen C, et al. (2019) Comparison of TaqMan, KASP and rhAmp SNP genotyping platforms in hexaploid wheat. PLoS ONE 14(5): e0217222. https://doi.org/10.1371/journal.pone.0217222
2. Broccanello, C., Chiodi, C., Funk, A., McGrath, J. M., Panella, L., & Stevanato, P. (2018). Comparison of three PCR-based assays for SNP genotyping in plants. Plant Methods, 14, 28. doi:10.1186/s13007-018-0295-6
3. Chen, D., Ding, Y., Guo, W., & Zhang, T. (2009). Molecular mapping of genic male-sterile genes ms15, ms5 and ms6 in tetraploid cotton. Plant Breeding, 128, 193-198.
4. Chen, Z.J., Sreedasyam, A., Ando, A. et al. (2020). Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Nat Genet 52, 525–533. https://doi.org/10.1038/s41588-020-0614-5
5. Feng, X., Keim, D., Wanjugi, H., Coulibaly, I., Fu, Y., Schwarz, J., Huesgen, S., & Cho, S. (2015). Development of molecular markers for genetic male sterility in Gossypium hirsutum. Molecular breeding : new strategies in plant improvement, 35(6), 141. https://doi.org/10.1007/s11032-015-0336-z
6. Mehetre, S. (2015). Constraints of hybrid seed production in upland and cultivated diploid cottons: Will different male sterility systems rescue? A review. J. Cotton Res. Dev. 29 : 181-211
7. Risch N. (1992). Genetic linkage: interpreting lod scores. Science (New York, N.Y.), 255(5046), 803–804. https://doi.org/10.1126/science.1536004
8. Sansaloni, C., Petroli, C., Jaccoud, D. et al. (2011). Diversity Arrays Technology (DArT) and next-generation sequencing combined: genome-wide, high throughput, highly informative genotyping for molecular breeding of Eucalyptus. BMC Proc 5 (Suppl 7), P54. https://doi.org/10.1186/1753-6561-5-S7-P54
9. Singh et al., “Male Sterility in Cotton”; Technical Bulletin-24; Central Institute for Cotton Research
10. Chinese patent application CN107338298A (SNP molecular markers relevant to lint index of upland cotton and applications of SNP molecular markers);
11. Chinese patent application CN107338304A (SNP molecular marker related to cotton weight per boll of upland cotton and application thereof);
12. Chinese patent application CN111778354A (Molecular marker closely linked with photosensitive male-sterile character of cotton PSM4 and molecular identification method).
Claims:
1. A SNP marker for identifying and selecting a genetically male sterile cotton plant or germplasm, wherein the SNP marker is an ms5 or ms6 sterile SNP marker allele at a SNP marker locus, and wherein the ms5 sterile SNP marker allele is: SNP1 (TRLCTMS-582) comprising a substitution of "T" at position 63 of SEQ ID NO: 1 in place of "C"; and wherein the ms6 sterile marker allele is selected from the group consisting of: SNP2 (TRLCTMS-2) comprising a substitution of "C" at position 101 of SEQ ID NO: 3 in place of "T", and SNP3 (TRLCTMS-3) comprising a substitution of "C" at position 101 of SEQ ID NO: 5 in place of "T", and wherein the ms5 sterile marker allele and at least one of the ms6 sterile marker alleles is present in double homozygous state in the genetically male sterile cotton plant.
2. A method of identifying a genetically male sterile cotton plant or germplasm, the method comprising the steps of:
detecting in a cotton plant or germplasm the presence of at least one ms5 sterile SNP marker allele and at least one ms6 sterile marker allele as claimed in claim 1, wherein both the at least one ms5 and the at least one ms6 sterile alleles are present in homozygous state and wherein the double ms5ms6 homozygous state is linked to the male sterile phenotype of the cotton plant or germplasm.
3. The method as claimed in claim 2, wherein the method further comprises identifying a fertile maintainer or double heterozygous fertile cotton plant or germplasm, the method comprising the step of:
a. detecting in a cotton plant or germplasm the presence of the at least one ms5 sterile SNP marker allele in heterozygous state and the at least one ms6 sterile marker allele in heterozygous state to identify the double heterozygous plant or germplasm for the at least one ms5 and the at least one ms6 sterile alleles, wherein the double heterozygous plant or germplasm exhibits fertile phenotype; or
b. detecting in the cotton plant or germplasm the presence of the at least one ms5 sterile SNP marker allele in heterozygous state or the at least one ms6 sterile marker allele in heterozygous state to identify the single heterozygous plant or germplasm for the at least one ms5 and the at least one ms6 sterile marker alleles, wherein the single heterozygous plant or germplasm exhibits fertile phenotype and has the maintainer plant genotype.
4. The method as claimed in claim 2, wherein the method comprises the steps of:
a. obtaining DNA from the cotton plant or germplasm; and
b. analysing the DNA from step (a) for the presence of ms5 and ms6 sterile SNP marker alleles.
5. The method as claimed in claim 2, wherein the single heterozygous plant or germplasm is identified as maintainer plant for maintaining genetically male sterile cotton plant lines.
6. The method of identifying a male sterile cotton plant or germplasm that displays genetic male sterility, the method comprising the step of:
detecting in the germplasm of the cotton plant at least one ms6 male sterile marker, wherein the at least one ms6 marker is located within a chromosomal interval comprising and flanked by SEQ ID NO:7 and SEQ ID NO:8.
7. The method as claimed in claim 6, wherein the method further comprises the steps of:
a. isolating DNA from the cotton plant or germplasm; and
b. analyzing the isolated DNA for the presence of at least one ms6 sterile marker allele in the chromosomal interval comprising and flanked by SEQ ID NO:7 and SEQ ID NO:8.
8. The method as claimed in claim 7, wherein the one or more ms6 sterile marker alleles are selected from the group consisting of SNP2 and SNP3.
9. A method of identifying a male sterile, maintainer fertile, or double heterozygous fertile cotton plant or germplasm from a cotton plant or germplasm population produced by crossing a male sterile cotton plant or germplasm with a fertile cotton plant or germplasm, the method comprising the steps of:
a. Crossing a genetically male sterile plant or germplasm which is homozygous for at least one ms5 sterile marker allele SNP1 and at least one ms6 male sterile allele selected from the group consisting of SNP2 and SNP3, as claimed in claim 1, with a second recurrent fertile parent cotton plant or germplasm to obtain a F1 plant and a segregating progeny F2 plant or germplasm population by selfing of F1 plant;
b. Selecting a F2 progeny plant from the segregating progeny F2 plant population from step (a) with:
i. both the at least one ms5 sterile allele and at least one ms6 sterile allele in homozygous state, wherein the F2 progeny plant or germplasm exhibits male sterile phenotype; or
ii. both the at least one ms5 sterile marker allele and the at least one ms6 sterile marker allele in heterozygous state, wherein the double heterozygous F2 plant or germplasm exhibits fertile phenotype; or
iii. either the at least one ms5 sterile marker allele or the at least one ms6 sterile marker allele in heterozygous state, wherein the single heterozygous F2 plant or germplasm exhibits fertile phenotype and is a maintainer plant or germplasm.
c. Selecting F3 progeny plants or germplasm from the segregating progeny F3 plant population or germplasm from step (b) with:
i. both the at least one ms5 sterile allele and at least one ms6 sterile allele in homozygous state, wherein the F3 progeny plant or germplasm exhibits male sterile phenotype; or
ii. both the at least one ms5 sterile marker allele and the at least one ms6 sterile marker allele in heterozygous state, wherein the double heterozygous F3 plant or germplasm exhibits fertile phenotype; or
iii. either the at least one ms5 sterile marker allele or the at least one ms6 sterile marker allele in heterozygous state, wherein the single heterozygous F3 plant or germplasm exhibits fertile phenotype and is a maintainer plant or germplasm.
d. repeating selection steps b and c “n” number of times whenever maintainer plants or double heterozygous plants or germplasm are selected, wherein n is 2 to 8 more filial generations, and wherein the single heterozygous alleles and the double heterozygous plants or germplasm are selected; and
e. selecting a sib-mating progeny plant or germplasm population derived from sib-crossing a near-isogenic fertile maintainer plant or germplasm which is heterozygous either at the at least one ms5 or the at least one ms6 sterile marker allele with a sterile plant or germplasm homozygous at both the at least one ms5 and the at least one ms6 marker alleles.
10. The method as claimed in claim 9, wherein the second parent plant or germplasm is a recurrent parent containing unfavourable alleles at ms5 and ms6, and the method further comprises the steps of:
a. backcrossing the F1 plant or germplasm obtained in step (a) of claim 9, with the recurrent parent cotton plant to get BC1F1 progeny plant population or germplasm; and
b. selecting a progeny plant or germplasm from the segregating BC1F1 progeny plant or germplasm population with both the at least one ms5 and the at least one ms6 sterile marker alleles in heterozygous state, and backcrossing the selected progeny plant or germplasm with the recurrent parent plant or germplasm to produce BC2F1, wherein the double heterozygous plant or germplasm is fertile;
c. Repeating steps (a)-(b) “n” number of times, wherein “n” is 2 to 5 or more, to obtain BCnF1 progeny plant, followed by selfing of BCnF1 plants to get BCnF2 segregating progeny plant or germplasm; and
d. Selecting BCnF2 near-isogenic recurrent plant type progeny plant or germplasm with maintainer plant that is heterozygous for either the at least one ms5 or the at least one ms6 sterile marker allele, and sib-mating a near isogenic maintainer fertile plant with a sterile plant within family to produce a male sterile plant or germplasm with background genotype of the recurrent parent plant or germplasm.
| # | Name | Date |
|---|---|---|
| 1 | 202321022922-STATEMENT OF UNDERTAKING (FORM 3) [29-03-2023(online)].pdf | 2023-03-29 |
| 2 | 202321022922-Sequence Listing in txt [29-03-2023(online)].txt | 2023-03-29 |
| 3 | 202321022922-Sequence Listing in PDF [29-03-2023(online)].pdf | 2023-03-29 |
| 4 | 202321022922-FORM 1 [29-03-2023(online)].pdf | 2023-03-29 |
| 5 | 202321022922-DECLARATION OF INVENTORSHIP (FORM 5) [29-03-2023(online)].pdf | 2023-03-29 |
| 6 | 202321022922-COMPLETE SPECIFICATION [29-03-2023(online)].pdf | 2023-03-29 |
| 7 | 202321022922-FORM-26 [29-05-2023(online)].pdf | 2023-05-29 |
| 8 | 202321022922-FORM-9 [12-04-2024(online)].pdf | 2024-04-12 |
| 9 | 202321022922-FORM 18 [12-04-2024(online)].pdf | 2024-04-12 |