Systems And Methods For Weather Based Pollen Monitoring

Abstract: A system for weather-based pollen monitoring, comprising: weather sensors to sense weather conditions to obtain ‘weather data’; a location module to sense the location of the weather sensors and to tag said sensed weather conditions with said location to obtain ‘location-tagged sensed weather data’; a clock module to tag the location-tagged sensed weather data with a date stamp to obtain ‘date-location-tagged sensed weather data’; and a microprocessor-based pollen processing unit configured to: determine phenological parameters; apply machine learning algorithms to estimate pollen counts for trees, grasses, and weeds using said ‘weather data’, said ‘location-tagged sensed weather data’, said ‘date-location-tagged sensed weather data’, and said phenological parameters; and output the estimated pollen counts. [[FIGURE 1]]

Patent Information

Application #

Filing Date

19 September 2023

Publication Number

12/2025

Publication Type

INA

Invention Field

PHYSICS

Status

Parent Application

Applicants

DATAIR TECHNOLOGY PRIVATE LIMITED

Cobalt Building 19, Church Street, Shanthala Nagar, Bengaluru, 560001, Karnataka, India

Inventors

1. PAREEKSHITH U S KATTI

DATAIR TECHNOLOGY PRIVATE LIMITED, Cobalt Building 19, Church Street, Shanthala Nagar, Bengaluru, 560001, Karnataka, India

2. ABHILASH MISHRA

DATAIR TECHNOLOGY PRIVATE LIMITED, Cobalt Building 19, Church Street, Shanthala Nagar, Bengaluru, 560001, Karnataka, India

3. N NITHIN SRIVATSAV

DATAIR TECHNOLOGY PRIVATE LIMITED, Cobalt Building 19, Church Street, Shanthala Nagar, Bengaluru, 560001, Karnataka, India

4. SHAFEEK BIN ASHARAF

DATAIR TECHNOLOGY PRIVATE LIMITED, Cobalt Building 19, Church Street, Shanthala Nagar, Bengaluru, 560001, Karnataka, India

Specification

DESC:FIELD OF THE INVENTION:
This invention relates to the field of environmental engineering and sensors.

Particularly, this invention relates to monitoring systems.

Specifically, this invention relates to systems and methods for weather-based pollen monitoring.

BACKGROUND OF THE INVENTION:
Pollen is a fine powder produced by trees and plants for reproduction. Pollen can trigger allergies for people who are sensitive to pollen. Counting pollen levels is an important step to understand behavior patterns of pollen and seasonality associated with it. Presently, pollen monitoring is a labour intensive and costly process. Most of the pollen counting is done manually and needs experts, expensive equipment to collect samples and microscopic analysis to count pollen. This has hindered the coverage of pollen stations across the world. Automated approaches have been invented in recent times which use computer vision-based approaches to sample and analyze pollen.

However, these automated sensors rely on expensive equipment, which also hinders coverage of pollen stations across the world.

OBJECTS OF THE INVENTION:
An object of the invention is to provide systems and methods for monitoring weather-based pollen status without using expensive and extensive equipment or a network thereof.

Another object of the invention is to provide systems and methods for monitoring weather-based pollen status over a wider coverage area.

Yet another object of the invention is to provide scalable systems and methods for monitoring weather-based pollen status over a wider coverage area.

SUMMARY OF THE INVENTION:
According to this invention, there is provided a system for weather-based pollen monitoring, said system comprising:
- an electronic board consisting, essentially, of:
o one or more weather sensors configured to sense weather conditions in order to obtain ‘weather data’;
o a location module configured to sense location for each of the one or more weather sensors in order to tag said sensed weather conditions with said location in order to obtain ‘location-tagged sensed weather data’;
o a clock module configured to tag said location-tagged sensed weather data with a date stamp in order to obtain ‘date-location-tagged sensed weather data’;
o a microprocessor-based pollen processing unit consisting of a microprocessor and a memory to store instructions, said pollen processing unit configured to:
▪ receive data from said one or more weather sensors and said location module;
▪ determine phenological parameters based on the geographic location determined from said location module and the time of year from said clock module;
▪ apply machine learning algorithms to estimate pollen counts for trees, grasses, and weeds using said ‘weather data’, said ‘location-tagged sensed weather data’, said ‘date-location-tagged sensed weather data’, and said phenological parameters;
▪ output the estimated pollen counts; and
- an Internet-of-Things (IOT) gateway consisting, essentially, of:
o a communication module configured to establish communication channels and communicate data to data servers.
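By way of a non-limiting illustration, the successive tagging of sensed weather data with location and date may be sketched as nested records. The class names, the field units, and the sample values below are assumptions made for illustration only; the specification does not prescribe a data layout.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class WeatherData:
    """'weather data' as sensed on the board (field names follow the specification)."""
    temp: float
    hum: float
    w_speed: float
    w_dir: float
    rain_intensity: float


@dataclass(frozen=True)
class LocationTagged:
    """'location-tagged sensed weather data'."""
    weather: WeatherData
    lat: float
    lon: float


@dataclass(frozen=True)
class DateLocationTagged:
    """'date-location-tagged sensed weather data'."""
    located: LocationTagged
    stamp: date


def tag_location(weather: WeatherData, lat: float, lon: float) -> LocationTagged:
    # Role of the location module: attach the sensed position to a reading.
    return LocationTagged(weather, lat, lon)


def tag_date(located: LocationTagged, stamp: date) -> DateLocationTagged:
    # Role of the clock module: attach a date stamp to a located reading.
    return DateLocationTagged(located, stamp)


# Placeholder reading and coordinates, not measured data.
reading = WeatherData(temp=27.5, hum=60.0, w_speed=3.2, w_dir=180.0, rain_intensity=0.0)
record = tag_date(tag_location(reading, 12.97, 77.59), date(2023, 9, 19))
```

Keeping each tagging stage as its own immutable record means the pollen processing unit can read the location and the time of year from one object without the raw reading being altered.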

In at least an embodiment, said system comprising:
o a first memory, associated with said microprocessor-based pollen processing unit, storing a set of instructions for determining phenological parameters based on geographic location and time of year;
o a second memory, associated with said microprocessor-based pollen processing unit, storing a set of instructions for estimating pollen counts using machine learning models based on environmental conditions, geographic location, and phenological parameters;
o said communication module being configured to establish communication channels and to communicate data to data servers, and from the first memory to the second memory; and
o said microprocessor-based pollen processing unit being configured to execute the instructions stored in the first and second memories to generate estimated pollen counts for trees, grasses, and weeds.

In at least an embodiment, said system comprising:
a. a training data augmenter configured to increase the training data size by generating random samples based on probability distributions for each month;
b. a residual dataset generator configured to create multiple datasets based on the chosen distribution for each month and calculate a residual dataset by subtracting the mean residual data from the actual data; and
c. an ensemble model trainer configured to train an ensemble of base models using both augmented training data and residual datasets, and to combine predictions from these models.

In at least an embodiment, said sensors are selected from a group of sensors consisting of temperature sensors, relative humidity sensors, wind sensors, and rain sensors.

In at least an embodiment, said system comprising a phenological comparator configured to identify ‘phenological’ parameters, from a database of parameters, in order to flag phenological parameters correlative to identified regions based on comparison with identified ‘phenological’ parameters.

In at least an embodiment, said ‘phenological’ parameters being selected from a group of ‘phenological’ parameters consisting of:
a. a list of categories in their peak season,
b. a list of categories in their starting season,
c. a list of categories in their ending season,
d. average historical tree pollen count,
e. average historical grass pollen count, and
f. average historical weed pollen count.

In at least an embodiment, said microprocessor-based pollen processing unit consisting of a microprocessor and a memory to store instructions, said pollen processing unit configured to derive phenological parameters comprising the steps of:
- establishing a phenological database with fields correlative to geolocation, time of year, and corresponding data relative to trees phenological parameter, grass phenological parameter, weed phenological parameter;
- determining geolocation;
- determining time of year;
- determining bounding boxes for that geolocation;
- determining season correlative to the determined geolocation, the determined time of year, and the correspondingly determined bounding box;
- querying said phenological database;
- assigning relevant phenological data, selectable from peak data, starting data, ending data, for determined bounding box and determined time of year;
- assigning phenological parameters, selectable from trees phenological parameter, grass phenological parameter, and weed phenological parameter based on assigned relevant phenological data; and
- outputting phenological parameters correlative to phenological data for use in pollen count prediction models.

According to this invention, there is provided a method for weather-based pollen monitoring, using a microprocessor-based pollen processing unit consisting of a microprocessor and a memory to store instructions, said pollen processing unit configured to train a pollen prediction model, said method comprising the steps of:
a. collecting historical data, including pollen counts, phenological parameters, and weather parameters;
b. computing average monthly mean (μm) and variance (σm²) of the historical pollen counts;
c. selecting a probability distribution that best fits the pollen count data, for each month, based on the computed average monthly mean (μm) and variance (σm²);
d. generating probability distributions monthly, encoding seasonal variations by calculating parameters and fitting various distributions including Beta Distribution and Gamma Distribution;
e. generating synthetic pollen count data using the fitted distribution from the selected probability distribution;
f. augmenting the training data with the generated synthetic pollen count data, thereby generating augmented training data, by filling in gaps and enhancing the dataset's completeness by:
i. generating n synthetic data points using the Normal Distribution with parameter mean (μ) and parameter variance (σ²);
ii. generating n synthetic data points using the Beta Distribution with parameters shape parameter (α) and shape parameter (β);
iii. generating n synthetic data points using the Gamma Distribution with parameters shape parameter (k) and scale parameter (θ);
iv. computing Mean Squared Error (MSE) for each distribution;
v. choosing the distribution with the least Mean Squared Error (MSE);
g. filling gaps of missing data points, in said historical data, by sampling from said probability distribution in order to produce synthetic pollen count values that align with historical data trends;
h. training an ensemble of machine learning models using the synthetic pollen count data to provide ensemble models;
i. training a stack of machine learning models using the augmented training data to provide stacking models; and
j. stacking predictions from the ensemble of machine learning models to produce a final pollen count prediction from the ensemble models and the stacking models.
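By way of a non-limiting illustration, the selection of a per-month distribution by least Mean Squared Error may be sketched as follows. Several choices here are assumptions not stated in the specification: parameters are fitted by the method of moments, Beta samples are rescaled by an assumed upper bound of 1.05 times the largest observed count, Normal draws are clamped at zero, and the MSE is computed between sorted observed and sorted synthetic values. The function names and the sample counts are hypothetical.

```python
import random
import statistics


def fit_and_sample(counts, n, rng):
    """Fit Normal, Gamma and Beta by method of moments; draw n samples from each."""
    mu = statistics.fmean(counts)
    var = statistics.pvariance(counts)
    if mu <= 0 or var <= 0:
        raise ValueError("need a positive mean and variance")
    samples = {
        # Normal draws are clamped at 0 (assumption: counts cannot be negative).
        "normal": [max(0.0, rng.gauss(mu, var ** 0.5)) for _ in range(n)],
        # Gamma with shape k = mu^2 / var and scale theta = var / mu.
        "gamma": [rng.gammavariate(mu * mu / var, var / mu) for _ in range(n)],
    }
    # Beta lives on [0, 1], so counts are rescaled by an assumed upper bound.
    upper = 1.05 * max(counts)
    m, v = mu / upper, var / upper ** 2
    common = m * (1 - m) / v - 1
    if common > 0:  # the moment estimates are only valid in this case
        alpha, beta = m * common, (1 - m) * common
        samples["beta"] = [upper * rng.betavariate(alpha, beta) for _ in range(n)]
    return samples


def quantile_mse(observed, synthetic):
    """MSE between sorted observed and sorted synthetic values (assumed metric)."""
    pairs = zip(sorted(observed), sorted(synthetic))
    return statistics.fmean([(o - s) ** 2 for o, s in pairs])


def choose_distribution(counts, rng):
    """Return the name, samples and per-distribution errors of the best fit."""
    samples = fit_and_sample(counts, len(counts), rng)
    errors = {name: quantile_mse(counts, drawn) for name, drawn in samples.items()}
    best = min(errors, key=errors.get)
    return best, samples[best], errors


# Placeholder counts for a single month, not measured data.
march_counts = [12.0, 30.0, 45.0, 22.0, 60.0, 18.0, 35.0, 27.0]
best, synthetic, errors = choose_distribution(march_counts, random.Random(0))
```

The selected samples can then serve as the synthetic pollen count data for that month; repeating the call per month yields one chosen distribution per month.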

In at least an embodiment, said step of training an ensemble of machine learning models using the augmented training data to provide ensemble models comprising the steps of:
a. generating random samples for each month based on probability distributions derived from historical pollen data;
b. replacing existing pollen count data with the generated random samples; and
c. augmenting the training dataset by adding the generated random samples to the existing training data until a predefined maximum training size is reached.
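By way of a non-limiting illustration, the augmentation up to a predefined maximum training size may be sketched as follows. The row layout, the `toy_sampler` stand-in for the month's fitted distribution, and the choice of copying a randomly picked historical row are assumptions made for illustration only.

```python
import random


def augment_training_data(rows, sample_for_month, max_size, rng):
    """Append synthetic rows until the dataset holds max_size rows."""
    if not rows:
        raise ValueError("need at least one historical row")
    augmented = list(rows)
    while len(augmented) < max_size:
        template = rng.choice(rows)
        synthetic = dict(template)
        # Replace the pollen count with a random sample for that row's month.
        synthetic["pollen"] = sample_for_month(template["month"], rng)
        augmented.append(synthetic)
    return augmented


def toy_sampler(month, rng):
    # Hypothetical stand-in for drawing from the month's fitted distribution.
    return rng.gammavariate(2.0, 10.0 + month)


# Placeholder history, not measured data.
history = [
    {"month": 3, "temp": 24.0, "pollen": 40.0},
    {"month": 4, "temp": 29.0, "pollen": 55.0},
]
augmented = augment_training_data(history, toy_sampler, 10, random.Random(1))
```

The original rows are kept at the front of the result, so the augmented set always contains the historical observations alongside the generated samples.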

In at least an embodiment, said step of training an ensemble of machine learning models using the augmented training data to provide residual data sets comprising the steps of:
a. generating multiple datasets for each month based on a selected probability distribution;
b. calculating a mean residual dataset by averaging the generated datasets; and
c. creating a residual dataset by subtracting the mean residual dataset from the actual historical pollen count data.
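By way of a non-limiting illustration, the residual dataset may be computed as below. The helper names, the number of generated datasets, and the Gaussian `toy_sampler` are assumptions; any per-month sampler drawn from the selected distribution could be substituted.

```python
import random
import statistics


def residual_dataset(actual, months, sample_for_month, n_datasets, rng):
    """Return (residuals, mean_dataset) for aligned lists of counts and months."""
    # a. generate several synthetic datasets, one value per observation
    generated = [[sample_for_month(m, rng) for m in months] for _ in range(n_datasets)]
    # b. average the generated datasets position by position
    mean_dataset = [statistics.fmean(column) for column in zip(*generated)]
    # c. subtract the mean dataset from the actual historical counts
    residuals = [a - g for a, g in zip(actual, mean_dataset)]
    return residuals, mean_dataset


def toy_sampler(month, rng):
    # Hypothetical stand-in for the month's selected distribution.
    return rng.gauss(10.0 * month, 2.0)


# Placeholder months and counts, not measured data.
months = [1, 2, 3]
actual = [12.0, 18.0, 35.0]
residuals, mean_dataset = residual_dataset(actual, months, toy_sampler, 50, random.Random(2))
```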

In at least an embodiment, said step of training an ensemble of machine learning models using the augmented training data to provide residual data sets comprising the steps of:
a. selecting a diverse set of base machine learning models;
b. training a first set of base models using augmented training data;
c. training a second set of base models using the residual dataset; and
d. combining the predictions from the first and second sets of base models to generate ensemble predictions.

In at least an embodiment, said step of training a stack of machine learning models using the augmented training data to provide stacking models comprising the steps of:
- configuring a two-layer structure, in that,
o a first layer configured to generate predictions; and
o a second layer with a meta-learner configured to be trained with generated predictions from said first layer and using synthetic pollen count data.
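By way of a non-limiting illustration, a two-layer structure may be sketched with two simple first-layer models and a one-parameter meta-learner. This is a hold-out blend, a simplified form of stacking, and every detail is an assumption: the base models (a mean predictor and a one-feature least-squares line), the half-and-half split, and the meta-learner that fits a single blending weight are illustrative and are not the models of the specification.

```python
import statistics


def fit_mean_model(xs, ys):
    """First-layer model 1: always predicts the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_line_model(xs, ys):
    """First-layer model 2: least-squares line on a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def fit_meta_weight(p1, p2, ys):
    """Second layer: weight w minimising the squared error of w*p1 + (1-w)*p2."""
    num = sum((y - b) * (a - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return num / den if den else 0.5


def fit_stack(xs, ys):
    half = len(xs) // 2
    # First layer is trained on the first half of the data.
    mean_model = fit_mean_model(xs[:half], ys[:half])
    line_model = fit_line_model(xs[:half], ys[:half])
    # Meta-learner is trained on first-layer predictions for the held-out half.
    p1 = [mean_model(x) for x in xs[half:]]
    p2 = [line_model(x) for x in xs[half:]]
    w = fit_meta_weight(p1, p2, ys[half:])
    return lambda x: w * mean_model(x) + (1 - w) * line_model(x)


# Placeholder data following y = 2x + 1 exactly.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
stack = fit_stack(xs, ys)
```

Training the meta-learner on predictions for data the first layer has not seen is the usual way to keep the second layer from simply rewarding first-layer overfitting.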

BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS:
This invention will now be described in relation to the accompanying drawings, in which:
FIGURE 1 illustrates an architecture of the system of this invention;
FIGURE 2 illustrates a schematic block diagram of the system of this invention; and
FIGURE 3 illustrates a schematic flowchart for the method of this invention.

DETAILED DESCRIPTION OF THE ACCOMPANYING DRAWINGS:
According to this invention, there are provided systems and methods for weather-based pollen monitoring. The present invention is a pollen station comprising weather sensors, a GPS module, a Pollen Processing Unit, and an IOT Gateway; it determines pollen counts by combining on-ground weather sensor data and location with algorithms and machine learning to estimate the pollen particles present. The station is relatively cheap to build and does not require expensive equipment. The station processes the pollen counts on the edge, and the entire processing is done inside the device. This makes the station easy and cost-effective to scale.

The system and method, of this invention, is a weather sensor-based pollen station capable of combining weather data from sensor/s, GPS data from a GPS module, time of year from a GPS module; all fed to a microprocessor based “Pollen Processing Unit” capable of running machine learning models and algorithms in order to estimate pollen count for given weather parameters, geography, and phenological parameters derived from time of year. The system and method, of this invention, processes pollen counts on the edge and entire processing is done inside the system.

In at least an embodiment, the system comprises two parts:
1) electronic board;
2) Internet-of-Things (IOT) Gateway.

FIGURE 1 illustrates an architecture of the system of this invention.
FIGURE 2 illustrates a schematic block diagram of the system of this invention.

In at least an embodiment, the electronic board comprises one or more weather sensors configured to sense weather conditions, GPS module configured to sense location, and a microprocessor-based pollen processing unit. The electronic board consists of a clock, weather sensors, GPS module, and a Pollen Processing Unit.
In preferred embodiments, the sensors are selected from a group of sensors consisting of temperature sensors, relative humidity sensors, wind sensors, rain sensors, and the like.
In preferred embodiments, the microprocessor-based pollen processing unit uses geographical data processed from the GPS module and then uses time to derive phenological parameters for different types of pollen. These parameters include the start of the season, the peak of the season, the end of the season, whether the timestamp is in season, and the position of the current timestamp with respect to the season; they are derived using an algorithm which searches a database by geographical region and time of the year. The phenological parameters are then passed to multiple machine learning models inside the microprocessor-based pollen processing unit along with the sensed weather parameters from the electronic board. The machine learning models are trained using historical trends and patterns observed from multiple public pollen datasets. The machine learning models process input data in order to give tree pollen counts, grass pollen counts, and weed pollen counts.
In at least an embodiment, the electronic board comprises a data acquisition module using at least an advanced laser-based rain sensor, an ultrasonic type wind sensor, a MEMS-based temperature sensor probe, and a GPS module with antenna; all of which are integrated into a wireless sensor node.

In at least an embodiment, the Internet-of-Things (IOT) Gateway comprises at least a communication module configured to establish communication channels, and communicate, data to data servers.

In at least an embodiment, the microprocessor-based pollen processing unit comprises a first memory with a set of instructions, in order to identify geography and obtain phenology; the set of instructions is explained below:
In at least an embodiment of the microprocessor-based pollen processing unit with the first memory, there is provided a location polling module configured to poll location data in terms of latitude data and longitude data from the GPS module of the electronic board.
In at least an embodiment of the microprocessor-based pollen processing unit with the first memory, there is provided a calendar-clock configured to determine ‘current month’ correlative to data from a database in order to identify ‘season’ data (e.g., spring season, summer season, fall season, winter season).
In at least an embodiment of the microprocessor-based pollen processing unit with the first memory, there is provided a bounding box comparator configured to identify ‘region’ data from a database of regions in order to flag supported regions correlative to identified poll location data based on comparison with identified poll location data. These regions represent areas where phenological data is available. The algorithm then goes through each region in the list of supported regions. For each region, it checks if the given latitude and longitude coordinates fall within the bounding box of that region. If the coordinates are within a region, that region is considered a potential match.
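By way of a non-limiting illustration, the bounding box comparison may be sketched as follows. The `Region` type, the region names, and the coordinates are hypothetical, and the sketch assumes that no bounding box crosses the 180° meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        # True when the point lies inside (or on the edge of) the bounding box.
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def matching_regions(lat, lon, supported):
    """Go through every supported region and keep those containing the point."""
    return [region for region in supported if region.contains(lat, lon)]


# Hypothetical supported regions, not the contents of the actual database.
SUPPORTED = [
    Region("region_a", 8.0, 20.0, 72.0, 85.0),
    Region("region_b", 35.0, 60.0, -10.0, 30.0),
]
matches = matching_regions(12.97, 77.59, SUPPORTED)
outside = matching_regions(0.0, 0.0, SUPPORTED)
```

An empty result indicates that the polled location lies outside every region for which phenological data is available.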
In at least an embodiment of the microprocessor-based pollen processing unit with the first memory, there is provided a phenological comparator configured to identify ‘phenological’ parameters, from a database of parameters, in order to flag phenological parameters correlative to identified regions based on comparison with identified ‘phenological’ parameters. Typically, the phenological parameters comprise:
a. “peak categories” - a list of categories in their peak season,
b. “starting categories” - a list of categories in their starting season,
c. “ending categories” - a list of categories in their ending season,
d. “avg_tree” - average historical tree pollen count,
e. “avg_grass” - average historical grass pollen count,
f. “avg_weed” - average historical weed pollen count.
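By way of a non-limiting illustration, the six phenological parameters may be held in one record keyed by region and month. The record type, the dictionary-backed database, the region name, and all values are placeholders assumed for illustration; only the field names come from the list above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class PhenologicalParameters:
    peak_categories: List[str]      # categories in their peak season
    starting_categories: List[str]  # categories in their starting season
    ending_categories: List[str]    # categories in their ending season
    avg_tree: float                 # average historical tree pollen count
    avg_grass: float                # average historical grass pollen count
    avg_weed: float                 # average historical weed pollen count


# Hypothetical database keyed by (region name, month number).
PhenologyDb = Dict[Tuple[str, int], PhenologicalParameters]

PHENOLOGY_DB: PhenologyDb = {
    ("region_a", 3): PhenologicalParameters(["tree"], ["grass"], [], 120.0, 15.0, 4.0),
    ("region_a", 9): PhenologicalParameters(["weed"], [], ["grass"], 5.0, 20.0, 90.0),
}


def lookup(region: str, month: int, db: PhenologyDb) -> Optional[PhenologicalParameters]:
    """Return the parameters for a region and month, or None when unsupported."""
    return db.get((region, month))


params = lookup("region_a", 3, PHENOLOGY_DB)
missing = lookup("region_a", 12, PHENOLOGY_DB)
```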

In at least an embodiment, the parameter “peak categories” lists the plant categories (such as specific trees, grasses, or weeds) that are in their peak season at a given time; during the peak season, the pollen count for these categories is typically at its highest. Knowing which categories are in their peak can help in accurately predicting high pollen levels.
In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided a peak categories determinator configured to identify ‘peak categories’ data, from a database of peak seasons, in order to output the peak categories (such as tree type/s, plant species, and the like) that are in peak season during the current month. The peak categories determinator is, or includes, a machine learning model trained with parameters including temperature, wind speed, wind direction, rain intensity, and average tree count.

In at least an embodiment, the parameter “starting categories” identifies the plant categories that are just beginning their season; understanding which plants are starting their season allows the system to anticipate an increase in pollen counts from these sources as the season progresses.
In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided a starting categories determinator configured to identify ‘starting categories’ data, from a database of peak seasons, in order to output, from a list of categories, those that are just starting their season during the currently identified month. The starting categories determinator is, or includes, a machine learning model trained with parameters including temperature, wind speed, wind direction, rain intensity, and average tree count.

In at least an embodiment, the parameter “ending categories” identifies the plant categories that are at the end of their season; pollen levels from these categories may start to decline, and knowing this helps in adjusting predictions for a decrease in their contribution to overall pollen counts.
In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided an ending categories determinator configured to identify ‘ending categories’ data, from a database of peak seasons, in order to output, from a list of categories, those that are just ending their season during the currently identified month. The ending categories determinator is, or includes, a machine learning model trained with parameters including temperature, wind speed, wind direction, rain intensity, and average tree count.

Peak Categories, Starting Categories, Ending Categories:
These categories represent the seasonal behaviour of pollen types (tree, grass, weed). The peak categories indicate when a certain pollen type is at its highest emission during the season, while starting and ending categories represent the transition phases of pollen seasonality.
The seasonal categories are derived based on the region and current month using phenological data, which correlates regional climate patterns with pollen seasonality. The rules are as follows:
• If a pollen type (e.g., "tree") is in the peak categories, this means the pollen count is expected to be at its highest.
• If a pollen type is in the starting categories, this indicates an increasing trend in pollen count.
• If a pollen type is in the ending categories, it implies the pollen count is decreasing.

In at least an embodiment, the parameter “Avg_Tree” represents the average historical pollen count for trees in a given region and season; this data serves as a baseline to compare current pollen levels, helping the system predict whether the current season’s pollen count is higher or lower than average.
In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided an average tree determinator configured to identify ‘average tree count’ data, from a database of peak seasons, in order to output historical average pollen counts for different categories (e.g., tree pollen, grass pollen, weed pollen) in the region for the currently identified month. The average tree determinator is, or includes, a machine learning model trained with parameters including temperature, wind speed, wind direction, rain intensity, and average tree count.
Pollen Calculation for Trees:
o If the type "tree" is present in any of the categories ("peak categories," "starting categories," "ending categories") obtained from the phenological parameters:
▪ The algorithm uses the machine learning model "m_tree" that has been trained with inputs ("temp," "hum," "w_speed," "w_dir," "rain_intensity," "avg_tree") (i.e. current temperature, current humidity, current wind speed, current wind direction, intensity of rainfall, historical average tree pollen count), and corresponding tree pollen counts.
▪ The model "m_tree" predicts the tree pollen count based on the provided environmental parameters and historical tree pollen data.
o If "tree" is absent from all of the categories, the tree pollen count is set to 0.

In at least an embodiment, the parameter “Avg_Grass” lists the average historical pollen count for grasses; similar to Avg_Tree, this baseline helps in assessing current grass pollen levels and predicting trends based on weather and other factors.
In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided an average grass determinator configured to identify ‘average grass count’ data, from a database of peak seasons, in order to output historical average grass counts for different categories (e.g., tree pollen, grass pollen, weed pollen) in the region for the currently identified month. The average grass determinator is, or includes, a machine learning model trained with parameters including temperature, wind speed, wind direction, rain intensity, and average grass count.
Pollen Calculation for Grass:
o If the type "grass" is present in any of the categories:
▪ The algorithm uses the machine learning model "m_grass" trained with inputs (i.e. current temperature, current humidity, current wind speed, current wind direction, intensity of rainfall, historical average grass pollen count) similar to the previous step and corresponding historical grass pollen counts.
▪ The model "m_grass" predicts the grass pollen count.
o If "grass" is absent from all of the categories, the grass pollen count is set to 0.

In at least an embodiment, the parameter “Avg_Weed” represents the average historical pollen count for weeds; this parameter helps in understanding and predicting weed pollen levels, which can be crucial during allergy season when weed pollen is a common trigger.
In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided an average weed determinator configured to identify ‘average weed count’ data, from a database of peak seasons, in order to output historical average weed counts for different categories (e.g., tree pollen, grass pollen, weed pollen) in the region for the currently identified month. The average weed determinator is, or includes, a machine learning model trained with parameters including temperature, wind speed, wind direction, rain intensity, and average weed count.
Pollen Calculation for Weeds:
o If the type "weed" is present in any of the categories:
▪ The algorithm uses the machine learning model "m_weed" trained with inputs (i.e. current temperature, current humidity, current wind speed, current wind direction, intensity of rainfall, historical average weed pollen count) similar to the previous steps and corresponding historical weed pollen counts.
▪ The model "m_weed" predicts the weed pollen count.
o If "weed" is absent from all of the categories, the weed pollen count is set to 0.

Typically, the phenological parameters are used by the system’s machine learning models to predict current pollen counts. By knowing what plants are at their peak, starting, or ending their seasons, and comparing this information with historical averages, the system can make accurate predictions about the amount and type of pollen present in the air at any given time. This makes the current invention a powerful tool for monitoring and forecasting pollen levels, which is crucial for allergy sufferers and environmental monitoring.

These parameters represent the average historical pollen counts for tree, grass, and weed pollen, respectively, over time (typically over a period of years) for the given region.
The machine learning models use these averages as baseline inputs to estimate pollen levels for the current conditions. These average pollen counts act as "historical references" in the models for calculating the current pollen count based on weather parameters like temperature, humidity, wind speed, etc.

Logic Flow:
• Phenological Parameters: The phenology database uses regional and seasonal data to derive the list of peak, starting, and ending categories based on the current month and region. This is a rule-based correlation between time, location, and pollen type, governed by historical and environmental data.

Pollen Calculation:
• The algorithm checks the phenological categories (peak, starting, ending) to determine if tree, grass, or weed pollen are currently active in the region. This is a binary check based on the category list derived from the phenology database.
• If a pollen type (e.g., tree) is in the peak, starting, or ending categories, the machine learning model for that pollen type is triggered, and the following inputs are passed: weather data (temperature, humidity, wind speed, wind direction, rain intensity) and historical average pollen count for the type (e.g., avg_tree for tree pollen).
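By way of a non-limiting illustration, the binary check and the per-type model call may be sketched as follows. The models are passed in as plain callables; the `baseline_model` stub, which merely returns the historical average, is a hypothetical stand-in for the trained m_tree, m_grass, and m_weed models, and all values are placeholders.

```python
WEATHER_KEYS = ("temp", "hum", "w_speed", "w_dir", "rain_intensity")
POLLEN_TYPES = ("tree", "grass", "weed")


def estimate_pollen(weather, phenology, models):
    """Return a predicted count per pollen type; inactive types are set to 0."""
    active = (set(phenology["peak_categories"])
              | set(phenology["starting_categories"])
              | set(phenology["ending_categories"]))
    counts = {}
    for kind in POLLEN_TYPES:
        if kind in active:
            # Inputs: the five weather values followed by the type's historical average.
            features = [weather[key] for key in WEATHER_KEYS] + [phenology["avg_" + kind]]
            counts[kind] = float(models[kind](features))
        else:
            counts[kind] = 0.0
    return counts


def baseline_model(features):
    # Hypothetical stub: returns the historical average (last feature) unchanged.
    return features[-1]


# Placeholder inputs, not measured data.
weather = {"temp": 27.5, "hum": 60.0, "w_speed": 3.2, "w_dir": 180.0, "rain_intensity": 0.0}
phenology = {
    "peak_categories": ["tree"],
    "starting_categories": ["grass"],
    "ending_categories": [],
    "avg_tree": 100.0,
    "avg_grass": 50.0,
    "avg_weed": 30.0,
}
models = {kind: baseline_model for kind in POLLEN_TYPES}
counts = estimate_pollen(weather, phenology, models)
```

Because weed appears in none of the three category lists in this example, its model is never called and its count is reported as 0.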

Machine Learning Model:
• Each pollen type (tree, grass, weed) has its own machine learning model that takes weather data and historical averages as inputs.
• The output of the model is the predicted pollen count for that pollen type based on the current conditions and the seasonal behavior (peak, starting, or ending).
• The model correlates the average historical count (e.g., avg_tree) with current weather conditions to estimate real-time pollen counts. For example, the historical data helps adjust the impact of current weather on pollen emissions, like how temperature and wind speed may influence pollen spread during peak or ending phases.

The peak, starting, and ending categories are derived from phenological rules based on region and season. They determine whether a particular pollen type is active.

avg_tree, avg_grass, and avg_weed are historical data points used to establish a baseline in the machine learning models, which correlate these with current environmental factors to estimate real-time pollen counts.

Logic for m_tree Model Prediction:
i) Input Parameters:
o The inputs to the m_tree model include:
▪ Temperature (temp): A continuous value, typically influenced by seasonal conditions.
▪ Humidity (hum): A percentage value, where different humidity levels have varying effects on pollen dispersal.
▪ Wind Speed (w_speed): Measured in velocity, it influences how widely pollen can spread.
▪ Wind Direction (w_dir): Different wind directions might transport pollen from different areas.
▪ Rain Intensity (rain_intensity): A measure of rainfall, where heavier rain typically suppresses pollen in the air.
▪ Average Historical Tree Pollen Count (avg_tree): Historical average pollen count for tree pollen in the given region, serving as a baseline.
ii) Model Training and Prediction:
o Training Data: The model is trained using historical data, which includes weather conditions and corresponding pollen counts. It learns the relationships between various weather parameters and pollen levels.
o Correlation and Conditions:
▪ The model identifies patterns between the environmental factors and pollen counts during the training phase. For example:
  - Temperature: Certain ranges of temperature may lead to increased pollen release, while extremes in either direction may reduce pollen dispersal.
  - Humidity: High or low humidity levels have different impacts on pollen spread, which are learned by the model.
  - Wind Speed: The model accounts for how different wind speeds affect pollen movement, with higher wind generally leading to more widespread pollen dispersal.
  - Rain Intensity: The model adjusts predictions based on rain intensity, as heavier rainfall typically leads to pollen being washed out of the air.
iii) Prediction Process:
o Based on the input data, the model compares the current environmental conditions to the patterns learned from the historical data. It adjusts the prediction based on how each parameter is affecting pollen dispersal at that time.
o For example, certain combinations of temperature, wind, and humidity might indicate higher pollen activity, while rain may reduce it.
iv) Output:
o The output is the predicted tree pollen count. The model will predict higher or lower counts based on how favorable the current weather conditions are for pollen dispersal, as learned during its training process.

Summary of Logic Flow:
• Step 1: Receive input parameters (temp, humidity, wind speed, wind direction, rain intensity, avg_tree).
• Step 2: The model processes these inputs, comparing them to the conditions it has learned from historical data.
• Step 3: It adjusts the predictions based on how these parameters correlate with pollen levels (for example, favorable temperature and wind might lead to a higher pollen count, while rainfall might suppress it).
• Step 4: Output the predicted tree pollen count based on the conditions.

The machine learning model determines relationships between the parameters without predefined rules or thresholds. Instead, it learns these patterns during training based on the historical data it is provided with.

As with m_tree, the same applies to m_grass and m_weed: the phenological parameters decide which model is used, and the average historical count differs for each category (tree, grass, and weed).

In at least an embodiment, the microprocessor-based pollen processing unit comprises a second memory storing a set of instructions for computing pollen data. This set of instructions is explained below:
The system and method, of this invention, takes several environmental parameters as inputs:
o "temp": Temperature
o "hum": Humidity
o "w_speed": Wind speed
o "w_dir": Wind direction
o "rain_intensity": Rain intensity
o "phenological_parameters": Phenological parameters derived from the previous algorithm.
The system and method, of this invention, uses machine learning or deep learning models to predict pollen counts for different types of plants (tree, grass, weed). These models have been trained using historical data and other relevant parameters.

In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided a location polling module configured to poll location data in terms of latitude data and longitude data from the GPS module of the electronic board.

In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided a calendar-clock configured to determine ‘current month’ correlative to data from a database in order to identify ‘season’ data.

In at least an embodiment of the microprocessor-based pollen processing unit with the second memory, there is provided a phenological comparator configured to identify ‘phenological’ parameters from a database of parameters in order to flag phenological parameters correlative to identified regions based on comparison with identified ‘phenological’ parameters. As an output, the system and method, of this invention, provide the retrieved phenological parameters for a chosen region. This information can be used to understand which plant categories are active, starting, or ending their seasons, as well as to provide historical pollen count averages for various plant types. Typically, the phenological parameters comprise:
a. “peak categories” - a list of categories in their peak season,
b. “starting categories” - a list of categories in their starting season,
c. “ending categories” - a list of categories in their ending season,
d. “avg_tree” - average historical tree pollen count,
e. “avg_grass” - average historical grass pollen count,
f. “avg_weed” - average historical weed pollen count.

The system and method, of this invention, returns, as an output, predicted pollen counts for each type of plant:
o "tree_count": Predicted tree pollen count
o "grass_count": Predicted grass pollen count
o "weed_count": Predicted weed pollen count
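The inputs and outputs listed above can be tied together in a short sketch. It assumes each trained model (m_tree, m_grass, m_weed) is available as a callable that maps a feature dictionary to a count, and that a category outside its season is reported as zero; both points are illustrative assumptions, as are the dictionary key spellings.

```python
from typing import Callable, Dict

Features = Dict[str, float]
Model = Callable[[Features], float]

CATEGORIES = ("tree", "grass", "weed")


def predict_counts(weather: Features, pheno: dict, models: Dict[str, Model]) -> Dict[str, float]:
    """Return tree_count, grass_count and weed_count for one weather reading."""
    # Categories that are starting, peaking, or ending their season are treated as active.
    active = (
        set(pheno["peak_categories"])
        | set(pheno["starting_categories"])
        | set(pheno["ending_categories"])
    )
    counts: Dict[str, float] = {}
    for category in CATEGORIES:
        if category not in active:
            # Assumption: a category outside its season is reported as zero.
            counts[f"{category}_count"] = 0.0
            continue
        # Each model receives the weather inputs plus its own historical average.
        features = dict(weather)
        features["avg"] = pheno[f"avg_{category}"]
        counts[f"{category}_count"] = models[f"m_{category}"](features)
    return counts
```

Any regressor exposing a comparable call could be dropped in for the hypothetical callables.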

Logic for Deriving Phenological Parameters:

Geographical Input (Latitude and Longitude):
The process starts by using the geographical coordinates (latitude and longitude) obtained from a GPS module. These coordinates help determine the specific region where the user or station is located.

Season Identification:
The current month or date is used to identify the corresponding season. The phenological patterns, such as the growth stages of plants and pollen activity, are often strongly linked to specific seasons (spring, summer, fall, winter). Each season has a different influence on tree, grass, and weed pollination.

Region Matching:
The geographical coordinates are compared with a predefined list of supported regions, each having a bounding box (a set of latitude and longitude ranges that define its boundaries).
The algorithm identifies the correct region by checking if the input coordinates fall within the bounding box of one of these regions.

Phenology Database Lookup:
Once the region is determined, the system looks up phenological data for that region in a phenology database. The phenology database contains information about plant species and their seasonal behaviors, including when different types of plants (trees, grass, weeds) are in their peak, starting, or ending phases of pollen emission.

Phenological Parameter Assignment:
The phenological parameters are derived based on the season and region combination. These parameters typically include:
Peak Categories: Plants or pollen types (tree, grass, weed) that are in their peak emission period.
Starting Categories: Plants or pollen types that are beginning their season.
Ending Categories: Plants or pollen types that are nearing the end of their emission period.
Average Historical Pollen Counts: The average historical counts for tree, grass, and weed pollen in the given region, based on previous seasons and years.

Output Phenological Parameters:
The phenological parameters (peak, starting, ending categories, and average historical counts) are then outputted to be used by the machine learning models for further processing, such as predicting pollen levels.

STEPS of Logic Flow for deriving phenological parameters:
a) Receive input: Latitude, longitude, and current month or date.
b) Identify the region: Compare latitude and longitude to predefined regional bounding boxes.
c) Determine the season: Use the current month to identify the season.
d) Query the phenology database: Retrieve relevant phenological data for the identified region and season.
e) Assign phenological parameters: Categorize plant types (trees, grass, weed) based on their seasonal phases (peak, starting, ending).
f) Return the phenological parameters: Provide the derived phenological features for use in pollen count prediction models.
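Steps a) to f) can be sketched as a bounding-box search followed by a table lookup. The region table, the month-to-season rule (a northern temperate convention), and every value in the phenology table below are hypothetical placeholders chosen only to make the sketch runnable; they are not data from this specification.

```python
from typing import Dict, Optional, Tuple

# Hypothetical region table: name -> (lat_min, lat_max, lon_min, lon_max).
REGIONS: Dict[str, Tuple[float, float, float, float]] = {
    "region_a": (12.0, 14.0, 77.0, 78.5),
}

# Hypothetical phenology database keyed by (region, season); placeholder values.
PHENOLOGY_DB: Dict[Tuple[str, str], dict] = {
    ("region_a", "spring"): {
        "peak_categories": ["tree"],
        "starting_categories": ["grass"],
        "ending_categories": [],
        "avg_tree": 120.0,
        "avg_grass": 30.0,
        "avg_weed": 5.0,
    },
}


def season_of(month: int) -> str:
    """Map a month (1-12) to a season; Dec-Feb is winter under this assumed rule."""
    return ("winter", "spring", "summer", "fall")[(month % 12) // 3]


def find_region(lat: float, lon: float) -> Optional[str]:
    """Return the first region whose bounding box contains the coordinates."""
    for name, (lat_min, lat_max, lon_min, lon_max) in REGIONS.items():
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
            return name
    return None


def derive_phenological_parameters(lat: float, lon: float, month: int) -> dict:
    """Identify region and season, then look up the phenological parameters."""
    region = find_region(lat, lon)
    if region is None:
        raise ValueError("coordinates fall outside every supported region")
    return PHENOLOGY_DB[(region, season_of(month))]
```

A region and season pair missing from the table raises a KeyError in this sketch; a deployed system would need an explicit fallback.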

The following discloses a method for Pollen Model Training:
FIGURE 3 illustrates a schematic flowchart for the method of this invention.
1. Data Collection and Preprocessing [STEP 301]:
In at least an embodiment, a data collector (DC) is configured to collect historical data including pollen counts, phenological parameters (e.g., peak, starting, and ending seasons), and weather parameters (e.g., temperature, humidity, wind speed, etc.). A processor (P) uses the historical pollen counts to derive the monthly mean (µm) and variance (σm²) of the pollen counts. These statistical metrics are essential for understanding the distribution of pollen counts across different months.
• Let D represent the historical pollen count dataset.
• Compute the monthly mean µm and variance σm² for each month m from D.
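As a minimal sketch of this step, the monthly statistics can be computed by grouping the historical records by month. The record layout (month, count) and the use of the population variance rather than the sample variance are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import fmean, pvariance
from typing import Dict, Iterable, Tuple


def monthly_stats(records: Iterable[Tuple[int, float]]) -> Dict[int, Tuple[float, float]]:
    """Return {month: (mean, variance)} for (month, pollen_count) records."""
    by_month = defaultdict(list)
    for month, count in records:
        by_month[month].append(count)
    # pvariance is the population variance; swap in statistics.variance if the
    # sample variance is preferred.
    return {m: (fmean(v), pvariance(v)) for m, v in sorted(by_month.items())}
```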

2. Probability Distribution Selection [STEP 302]:
In at least an embodiment, the processor (P) is configured to use, for each month, the mean pollen count and variance derived from the data collected by the data collector (DC) to create a probability distribution that best fits the pollen count data. In preferred embodiments, this distribution can be any suitable continuous distribution, such as a Gaussian (normal) distribution, beta distribution, or gamma distribution, chosen based on the data characteristics. The parameters for each distribution type are calculated, and the distribution is fitted to the monthly data.
For each month m:
1. Calculate parameters for Normal Distribution:
o Calculate the mean µm and variance σm² for month m.
o Let the parameters for the Normal Distribution be (µm, σm²).
o Fit a normal distribution N(µm, σm²).

3. Creating the Probability Distribution: Encoding of seasonal patterns [STEP 303]:
In at least an embodiment, using an encoder (EN), seasonal patterns are intrinsically encoded by generating probability distributions monthly. The probability distribution (Beta distribution, Gamma distribution) represents likelihood of observing different pollen count values for a given month.
• Seasonal Pattern Encoding: An encoder (EN) generates probability distributions monthly, encoding seasonal variations. This step involves calculating parameters and fitting various distributions like Beta and Gamma to the monthly data.
2. Calculate parameters for Beta Distribution:
a. Calculate the shape parameters α and β based on µ and σ²:
i.
b. Fit a Beta distribution
i.

3. Calculate parameters for Gamma Distribution:
a. Calculate the shape parameter k and scale parameter θ based on µ and σ²:
i.
b. Fit a Gamma distribution
i.
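The Beta and Gamma parameters can be obtained from µ and σ² in more than one way. Assuming method-of-moments estimators (an assumption; the estimator is not stated here), the standard relations are given below. The Beta relations further assume that the counts have been rescaled to the interval (0, 1) and that σ² < µ(1 − µ).

```latex
\begin{aligned}
\text{Normal:}\quad & X \sim \mathcal{N}(\mu_m, \sigma_m^2) \\
\text{Beta:}\quad & \alpha = \mu_m\left(\frac{\mu_m(1-\mu_m)}{\sigma_m^2} - 1\right), \qquad
                    \beta = (1-\mu_m)\left(\frac{\mu_m(1-\mu_m)}{\sigma_m^2} - 1\right) \\
\text{Gamma:}\quad & k = \frac{\mu_m^2}{\sigma_m^2}, \qquad \theta = \frac{\sigma_m^2}{\mu_m}
\end{aligned}
```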
• Synthetic Data Generation: Synthetic pollen count data is generated using the fitted distributions, which helps to augment the training data by filling in gaps and enhancing the dataset's completeness.
4. Generate synthetic data Sm using the distributions:
Incorporating the synthetic pollen count values generated from the probability distribution involves adding these values as supplementary training data points alongside the observed data. This strategy improves the dataset's overall completeness and mitigates the potential influence of missing data on the model's performance.
a. Generate n synthetic data points using the Normal Distribution with parameters µ and σ².
b. Generate n synthetic data points using the Beta Distribution with parameters α and β.
c. Generate n synthetic data points using the Gamma Distribution with parameters k and θ.

5. Calculate Mean Squared Error (MSE) for each distribution:
a. Calculate MSE for Normal Distribution
i.
b. Calculate MSE for Beta Distribution
i.
c. Calculate MSE for Gamma Distribution
i.

6. Choose the distribution with the least Mean Squared Error:
a. Find the distribution Distm (N, B, or G) with the smallest MSE value:
i. Distm = argmin over d in {N, B, G} of MSEd
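Items 1 to 6 can be sketched end to end for a single month. The sketch assumes method-of-moments parameters, a hypothetical `upper` bound used to rescale counts to (0, 1) for the Beta fit, and an error measure that compares sorted synthetic samples with sorted observations; the exact MSE definition is not given here, so that pairing is an assumption.

```python
import random
from statistics import fmean, pvariance


def samplers(observed, upper):
    """Build one sampling function per candidate distribution (assumes variance > 0)."""
    mu, var = fmean(observed), pvariance(observed)
    m, v = mu / upper, var / upper ** 2       # rescale moments to (0, 1) for Beta
    common = m * (1 - m) / v - 1
    a, b = m * common, (1 - m) * common       # Beta shape parameters
    k, theta = mu * mu / var, var / mu        # Gamma shape and scale
    return {
        "N": lambda rng: rng.gauss(mu, var ** 0.5),
        "B": lambda rng: upper * rng.betavariate(a, b),
        "G": lambda rng: rng.gammavariate(k, theta),
    }


def mse_against(observed, synthetic):
    """Assumed error measure: mean squared gap between sorted samples."""
    gaps = [(o - s) ** 2 for o, s in zip(sorted(observed), sorted(synthetic))]
    return fmean(gaps)


def choose_distribution(observed, upper, seed=0):
    """Return (name of best distribution, {name: MSE}) for one month's counts."""
    rng = random.Random(seed)
    n = len(observed)
    scores = {}
    for name, draw in samplers(observed, upper).items():
        synthetic = [draw(rng) for _ in range(n)]
        scores[name] = mse_against(observed, synthetic)
    best = min(scores, key=scores.get)
    return best, scores
```

With few observations the winner depends on the random draw, so a larger n or repeated draws would give a more stable choice.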

4. Gap Filling [STEP 304]:
In at least an embodiment, when a predictor (PD) encounters gaps or missing data points in the historical pollen count data for a specific month during the training phase, it utilizes the probability distribution previously generated for that month in STEP 303. Using a sampler (SM), the predictor draws samples from that distribution to produce synthetic pollen count values that align with the historical data trends and variances of STEP 301.
• Handling Missing Data: When gaps in the historical pollen data are encountered, the predictor (PD) uses the previously generated probability distributions to sample synthetic pollen counts. These synthetic values replace the missing data, ensuring the dataset remains consistent and comprehensive.
1. Let Gap represent a gap in the training data occurring during month m.
2. Utilize the distribution Distm chosen for month m.
3. Create synthetic data GapData based on the distribution Distm, replacing the missing or incomplete data in the gap.
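A minimal sketch of the gap-filling step follows. It assumes missing counts are marked with None and that the distribution chosen for each month is available as a sampling function; the name `dist_samplers` and the clamping of negative draws to zero are illustrative choices.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple


def fill_gaps(
    series: List[Tuple[int, Optional[float]]],
    dist_samplers: Dict[int, Callable[[random.Random], float]],
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Replace each missing count with a draw from that month's chosen distribution."""
    rng = random.Random(seed)
    filled = []
    for month, count in series:
        if count is None:
            # Clamp at zero because a pollen count cannot be negative.
            count = max(0.0, dist_samplers[month](rng))
        filled.append((month, count))
    return filled
```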

5. Training Data Augmentation [STEP 305]:
• Augmenting Training Data: A training data augmenter (TDA) increases the training data size by generating random samples based on the probability distributions for each month. This augmented data set improves the robustness of the machine learning model.
In at least an embodiment, the training data augmenter (TDA) is employed, with a processor, having stored instructions, to:
1. Let MaxTrainingSize be the maximum training size.
2. While TrainingSize < MaxTrainingSize:
   For each month m:
      generate random samples RandSamplesm from the distribution chosen for month m and use them in place of pollen count data.
3. Within each pass of the loop, augment the training data by adding RandSamplesm and update TrainingSize accordingly:
   TrainingData = TrainingData + RandSamplesm
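The loop above can be sketched as follows, assuming one synthetic sample per month per pass and the same hypothetical per-month sampling functions as in the gap-filling step. This sketch only appends samples; it does not overwrite existing observations.

```python
import random
from typing import Callable, Dict, List, Tuple


def augment(
    training: List[Tuple[int, float]],
    dist_samplers: Dict[int, Callable[[random.Random], float]],
    max_training_size: int,
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Append synthetic (month, count) pairs until max_training_size is reached."""
    rng = random.Random(seed)
    augmented = list(training)
    months = sorted(dist_samplers)
    # The "and months" guard avoids an endless loop when no sampler is supplied.
    while len(augmented) < max_training_size and months:
        for month in months:
            if len(augmented) >= max_training_size:
                break
            augmented.append((month, max(0.0, dist_samplers[month](rng))))
    return augmented
```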

6. Residual Dataset Creation [STEP 306]:
• Residual Analysis: A residual dataset generator (RDG) creates multiple datasets based on the chosen distribution for each month. The mean residual dataset is computed, and the residual dataset is calculated by subtracting the mean residual data from the actual data, which helps in fine-tuning the model by focusing on the differences between predicted and actual values.
In at least an embodiment, the residual dataset generator (RDG) is employed, with a processor, having stored instructions, to:
1. Generate n number of datasets `ResidualData_i` for a length equal to the length of TrainingData using the chosen distribution for each month.
2. Compute the mean dataset `MeanResidualData` from `ResidualData_i`.
3. Calculate the residual dataset `Residual = D - MeanResidualData`.
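The three instructions can be sketched as below. The sketch assumes the actual data D is aligned position by position with a list of months, and it reuses the hypothetical per-month sampling functions; `n_datasets` stands in for n.

```python
import random
from statistics import fmean
from typing import Callable, Dict, List, Tuple


def residual_dataset(
    actual: List[float],
    months: List[int],
    dist_samplers: Dict[int, Callable[[random.Random], float]],
    n_datasets: int = 50,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Return (MeanResidualData, Residual) where Residual = D - MeanResidualData."""
    rng = random.Random(seed)
    # Step 1: n synthetic datasets, each as long as the actual data.
    generated = [[dist_samplers[m](rng) for m in months] for _ in range(n_datasets)]
    # Step 2: position-wise mean across the generated datasets.
    mean_residual = [fmean(column) for column in zip(*generated)]
    # Step 3: subtract the mean dataset from the actual data.
    residual = [a - b for a, b in zip(actual, mean_residual)]
    return mean_residual, residual
```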

7. Ensemble Model Training [STEP 307]:
• Training Ensemble Models: An ensemble model trainer (EMT) creates an ensemble of diverse base models (e.g., Random Forests, Gradient Boosting). These models are trained on both the augmented training data and the residual datasets. Predictions from these models are combined to form the ensemble predictions, which help in reducing the model's overall error.
In at least an embodiment, the ensemble model trainer (EMT) is configured to create an ensemble of base models by selecting a diverse set of foundational models. This collection might encompass decision trees, random forests, gradient boosting, support vector machines, and other options. After assembling this ensemble, each of these base models is trained using the available training data.
Ensemble 1 - Random Forest:
• Train Random Forest RF1 using the augmented training data.
• Train Random Forest RF2 using the residual dataset.
• Combine predictions: Ensemble1Prediction = (RF1Prediction + (RF2Prediction + MeanResidualData))/2.
Ensemble 2 - Gradient Boosting:
• Train Gradient Boosting GB1 using the augmented training data.
• Train Gradient Boosting GB2 using the residual dataset.
• Combine predictions: Ensemble2Prediction = (GB1Prediction + (GB2Prediction + MeanResidualData))/2.
Note: Random Forest and Gradient Boosting are widely used machine learning algorithms.
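The combination rule is the same for both ensembles: one model predicts the count directly, a second predicts the residual, and the mean residual data is added back before averaging. The sketch below illustrates that rule with a tiny nearest-neighbour regressor on a single feature. That regressor is only a stand-in so the example runs without external libraries; it is not a Random Forest or Gradient Boosting model.

```python
from statistics import fmean
from typing import Callable, List


class NearestNeighbourRegressor:
    """Stand-in base learner: averages the targets of the k closest training points."""

    def __init__(self, k: int = 2) -> None:
        self.k = k
        self.data: list = []

    def fit(self, xs: List[float], ys: List[float]) -> "NearestNeighbourRegressor":
        self.data = list(zip(xs, ys))
        return self

    def predict(self, x: float) -> float:
        nearest = sorted(self.data, key=lambda pair: abs(pair[0] - x))[: self.k]
        return fmean([y for _, y in nearest])


def train_ensemble(
    xs: List[float], counts: List[float], mean_residual: List[float]
) -> Callable[[float, float], float]:
    """Train one model on counts and one on residuals; return the combined predictor."""
    residuals = [c - m for c, m in zip(counts, mean_residual)]
    model_1 = NearestNeighbourRegressor().fit(xs, counts)      # plays the role of RF1 / GB1
    model_2 = NearestNeighbourRegressor().fit(xs, residuals)   # plays the role of RF2 / GB2

    def predict(x: float, mean_at_x: float) -> float:
        # (Model1Prediction + (Model2Prediction + MeanResidualData)) / 2
        return (model_1.predict(x) + (model_2.predict(x) + mean_at_x)) / 2

    return predict
```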

8. Stacking of Models [STEP 308]:
• Model Stacking: A stacking model trainer (SMT) combines predictions from the base models (from the previous step) in a two-layer structure. In the first layer, the base models generate predictions, which are then used to train a meta-learner in the second layer. This approach enhances the final prediction by leveraging the strengths of multiple models.
In at least an embodiment, the stacking model trainer (SMT) is configured to employ a two-layer structure. In the first layer, the chosen base models, which encompass a diverse set including decision trees, random forests, gradient boosting, and more, generate predictions based on the provided data. These base model predictions, combined with synthetic data, are then used as input for the second layer. The second layer involves training a meta-learner (meta-model) on the predictions from both the base models and the synthetic data. The meta-learner's purpose is to amalgamate the base model predictions, thereby producing a more robust and aggregated final prediction for the ensemble model.
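As a deliberately small illustration of the second layer, the sketch below fits a one-parameter meta-learner that blends two base-model prediction series by least squares. Restricting the meta-learner to a single blend weight is an assumption made to keep the example self-contained; a fuller meta-model could be any regressor trained on the first-layer predictions.

```python
from typing import List


def fit_blend_weight(pred_a: List[float], pred_b: List[float], targets: List[float]) -> float:
    """Least-squares weight w minimising sum((y - (w*a + (1 - w)*b))**2)."""
    numerator = sum((y - b) * (a - b) for a, b, y in zip(pred_a, pred_b, targets))
    denominator = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    # If both base models always agree, any weight gives the same blend.
    return 0.5 if denominator == 0 else numerator / denominator


def stacked_predict(a: float, b: float, w: float) -> float:
    """Second-layer output for one pair of first-layer predictions."""
    return w * a + (1 - w) * b
```

In common stacking practice the first-layer predictions used to fit the meta-learner are produced on data the base models did not see during training, which limits leakage.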

9. Final Prediction Generation: Combining Ensemble and Stacking [STEP 309]:
• Generating Final Predictions: A final predictor (FP) combines the outputs from the ensemble models and the stacking models to generate the final pollen count prediction. This combination mitigates errors from individual models and addresses issues like overfitting by averaging out predictions from different models.
In at least an embodiment, the final predictor (FP) is configured to concatenate outputs of ensemble model trainer (EMT) [STEP 307] and stacking model trainer (SMT) [STEP 308].
Outputs of both Ensemble and Stacking are combined to generate the final output. By combining ensemble, stacking, and synthetic data, major issues with pollen modeling are addressed: the training size is increased, overfitting (mainly due to small training sets) is avoided, and errors due to a bad prediction from a single base model are minimized.
• Combine predictions from both ensembles: EnsemblePredictions = (Ensemble1Prediction + Ensemble2Prediction)/2.
• Stack ensemble outputs with the generated dataset: StackedPredictions = EnsemblePredictions + GeneratedDataset.
• Average out the stacked predictions: FinalPrediction = (1 / NumPredictions) * Σ(StackedPredictions).
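One reading of these three formulas is sketched below. It assumes that the "+" in StackedPredictions denotes stacking prediction series side by side (concatenation) rather than numeric addition, and that the average is taken position by position across the stacked series; both are interpretive assumptions.

```python
from statistics import fmean
from typing import List


def final_prediction(
    ensemble1: List[float], ensemble2: List[float], generated_sets: List[List[float]]
) -> List[float]:
    """Average the ensemble mean together with each generated series, per position."""
    # EnsemblePredictions = (Ensemble1Prediction + Ensemble2Prediction) / 2
    ensemble = [(a + b) / 2 for a, b in zip(ensemble1, ensemble2)]
    # StackedPredictions: the ensemble series placed alongside the generated series.
    stacked = [ensemble, *generated_sets]
    # FinalPrediction = (1 / NumPredictions) * sum over the stacked series.
    return [fmean(column) for column in zip(*stacked)]
```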

10. Online Learning and Deployment [STEP 310]:
• Model Deployment: The final model is deployed in a pollen processing unit, where it continues to learn and adapt over time (online learning). A deployment module (DM) periodically updates the training data with new predictions and retrains the model to ensure it remains accurate and relevant.
In at least an embodiment, the deployment module (DM) is employed, with a processor, having stored instructions, to:
• Deploy the model in the pollen processing unit for online learning.
• Periodically store predicted values and update training data.
• Retrain the model using the updated training data.
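The three deployment instructions can be sketched as a small wrapper. The `train` callable, the (features, value) record layout, and the `retrain_every` threshold are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, List, Tuple

Record = Tuple[Any, float]
Model = Callable[[Any], float]


class OnlineDeployment:
    """Stores predictions and retrains once enough new records have accumulated."""

    def __init__(self, train: Callable[[List[Record]], Model],
                 training_data: List[Record], retrain_every: int = 24) -> None:
        self.train = train
        self.training_data = list(training_data)
        self.retrain_every = retrain_every
        self.pending: List[Record] = []
        self.retrain_count = 0
        self.model = train(self.training_data)

    def predict(self, features: Any) -> float:
        value = self.model(features)
        # Periodically store predicted values...
        self.pending.append((features, value))
        if len(self.pending) >= self.retrain_every:
            # ...update the training data, then retrain on the updated data.
            self.training_data.extend(self.pending)
            self.pending.clear()
            self.model = self.train(self.training_data)
            self.retrain_count += 1
        return value
```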

The TECHNICAL ADVANCEMENT, of this invention, lies in providing systems and methods for weather-based pollen monitoring which eliminate the need for expensive and extensive equipment.

While this detailed description has disclosed certain specific embodiments for illustrative purposes, various modifications will be apparent to those skilled in the art which do not constitute departures from the spirit and scope of the invention as defined in the following claims, and it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
CLAIMS:WE CLAIM,

1. A system for weather-based pollen monitoring, said system comprising:
- an electronic board consisting, essentially, of:
o one or more weather sensors configured to sense weather conditions in order to obtain ‘weather data’;
o a location module configured to sense location for each of the one or more weather sensors in order to tag said sensed weather conditions with said location in order to obtain ‘location-tagged sensed weather data’;
o a clock module configured to tag location-tagged sensed weather data with a date stamp in order to obtain ‘date-location-tagged sensed weather data’;
o a microprocessor-based pollen processing unit consisting of a microprocessor and a memory to store instructions, said pollen processing unit configured to:
▪ receive data from said one or more weather sensors and said location module;
▪ determine phenological parameters based on the geographic location, determined from said location module, and the time of year from said clock module;
▪ apply machine learning algorithms to estimate pollen counts for trees, grasses, and weeds using said ‘weather data’, said ‘location-tagged sensed weather data’, said ‘date-location-tagged sensed weather data’, and said phenological parameters;
▪ output the estimated pollen counts; and
- an Internet-of-Things (IOT) gateway consisting, essentially, of:
o a communication module configured to establish communication channels, and communicate data to data servers.

2. The system as claimed in claim 1 wherein, said system comprising:
o a first memory, associated with said microprocessor-based pollen processing unit, storing a set of instructions for determining phenological parameters based on geographic location and time of year;
o a second memory, associated with said microprocessor-based pollen processing unit, storing a set of instructions for estimating pollen counts using machine learning models based on environmental conditions, geographic location, and phenological parameters;
▪ said communication module configured to establish communication channels, and communicate data to data servers and from said first memory to said second memory; and
▪ said microprocessor-based pollen processing unit configured to execute the instructions stored in the first and second memories to generate estimated pollen counts for trees, grasses, and weeds.

3. The system as claimed in claim 1 wherein, said system comprising:
a. a training data augmenter configured to increase the training data size by generating random samples based on probability distributions for each month;
b. a residual dataset generator configured to create multiple datasets based on the chosen distribution for each month and calculate a residual dataset by subtracting the mean residual data from the actual data; and
c. an ensemble model trainer configured to train an ensemble of base models using both augmented training data and residual datasets, and to combine predictions from these models.

4. The system as claimed in claim 1 wherein, said sensors are selected from a group of sensors consisting of temperature sensors, relative humidity sensors, wind sensors, and rain sensors.

5. The system as claimed in claim 1 wherein, said system comprising a phenological comparator configured to identify ‘phenological’ parameters, from a database of parameters, in order to flag phenological parameters correlative to identified regions based on comparison with identified ‘phenological’ parameters.

6. The system as claimed in claim 1 wherein, said ‘phenological’ parameters being selected from a group of ‘phenological’ parameters consisting of:
a. a list of categories in their peak season,
b. a list of categories in their starting season,
c. a list of categories in their ending season,
d. average historical tree pollen count,
e. average historical grass pollen count, and
f. average historical weed pollen count.

7. The system as claimed in claim 1 wherein, said microprocessor-based pollen processing unit consisting of a microprocessor and a memory to store instructions, said pollen processing unit configured to derive phenological parameters comprising the steps of:
- establishing a phenological database with fields correlative to geolocation, time of year, and corresponding data relative to trees phenological parameter, grass phenological parameter, weed phenological parameter;
- determining geolocation;
- determining time of year;
- determining bounding boxes for that geolocation;
- determining season correlative to determined geolocation, determined time of year, and correspondingly determining bounding box;
- querying said phenological database;
- assigning relevant phenological data, selectable from peak data, starting data, ending data, for determined bounding box and determined time of year;
- assigning phenological parameters, selectable from trees phenological parameter, grass phenological parameter, and weed phenological parameter based on assigned relevant phenological data; and
- outputting phenological parameters correlative to phenological data for use in pollen count prediction models.

8. A method for weather-based pollen monitoring, using a microprocessor-based pollen processing unit consisting of a microprocessor and a memory to store instructions, said pollen processing unit configured to train a pollen prediction model, comprising the steps of:
a. collecting historical data, including pollen counts, phenological parameters, and weather parameters;
b. computing average monthly mean (µm) and variance (σm²) of the historical pollen counts;
c. selecting a probability distribution that best fits the pollen count data, for each month, based on the computed average monthly mean (µm) and variance (σm²);
d. generating probability distributions monthly, encoding seasonal variations by calculating parameters and fitting various distributions including Beta Distribution and Gamma Distribution;
e. generating synthetic pollen count data using the fitted distribution from the selected probability distribution;
f. augmenting the training data with the generated synthetic pollen count data, thereby generating augmented training data, by filling in gaps and enhancing the dataset's completeness by:
i. generating n synthetic data points using the Normal Distribution with parameter mean (µ) and parameter variance (σ²);
ii. generating n synthetic data points using the Beta Distribution with parameters shape parameter (α) and shape parameter (β);
iii. generating n synthetic data points using the Gamma Distribution with parameters shape parameter (k) and scale parameter (θ);
iv. computing Mean Squared Error (MSE) for each distribution;
v. choosing the distribution with the least Mean Squared Error (MSE);
g. filling gaps of missing data points, in said historical data, by sampling from said probability distribution in order to produce synthetic pollen count values that align with historical data trends;
h. training an ensemble of machine learning models using the synthetic pollen count data to provide ensemble models;
i. training a stack of machine learning models using the augmented training data to provide stacking models; and
j. stacking predictions from the ensemble of machine learning models to produce a final pollen count prediction from the ensemble models and the stacking models.

9. The method as claimed in claim 8 wherein, said step of training an ensemble of machine learning models using the augmented training data to provide ensemble models comprising the steps of:
a. generating random samples for each month based on probability distributions derived from historical pollen data;
b. replacing existing pollen count data with the generated random samples; and
c. augmenting the training dataset by adding the generated random samples to the existing training data until a predefined maximum training size is reached.

10. The method as claimed in claim 8 wherein, said step of training an ensemble of machine learning models using the augmented training data to provide residual data sets comprising the steps of:
a. generating multiple datasets for each month based on a selected probability distribution;
b. calculating a mean residual dataset by averaging the generated datasets; and
c. creating a residual dataset by subtracting the mean residual dataset from the actual historical pollen count data.

11. The method as claimed in claim 8 wherein, said step of training an ensemble of machine learning models using the augmented training data to provide residual data sets comprising the steps of:
a. selecting a diverse set of base machine learning models;
b. training a first set of base models using augmented training data;
c. training a second set of base models using the residual dataset; and
d. combining the predictions from the first and second sets of base models to generate ensemble predictions.

12. The method as claimed in claim 8 wherein, said step of training a stack of machine learning models using the augmented training data to provide stacking models comprising the steps of:
- configuring a two-layer structure, in that,
o a first layer configured to generate predictions; and
o a second layer with a meta-learner configured to be trained with generated predictions from said first layer and using synthetic pollen count data.

Documents

Application Documents

#	Name	Date
1	202341062744-PROVISIONAL SPECIFICATION [19-09-2023(online)].pdf	2023-09-19
2	202341062744-PROOF OF RIGHT [19-09-2023(online)].pdf	2023-09-19
3	202341062744-POWER OF AUTHORITY [19-09-2023(online)].pdf	2023-09-19
4	202341062744-FORM FOR STARTUP [19-09-2023(online)].pdf	2023-09-19
5	202341062744-FORM FOR STARTUP [19-09-2023(online)]-1.pdf	2023-09-19
6	202341062744-FORM FOR SMALL ENTITY(FORM-28) [19-09-2023(online)].pdf	2023-09-19
7	202341062744-FORM 3 [19-09-2023(online)].pdf	2023-09-19
8	202341062744-FORM 1 [19-09-2023(online)].pdf	2023-09-19
9	202341062744-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [19-09-2023(online)].pdf	2023-09-19
10	202341062744-EVIDENCE FOR REGISTRATION UNDER SSI [19-09-2023(online)].pdf	2023-09-19
11	202341062744-EVIDENCE FOR REGISTRATION UNDER SSI [19-09-2023(online)]-1.pdf	2023-09-19
12	202341062744-DRAWINGS [19-09-2023(online)].pdf	2023-09-19
13	202341062744-Proof of Right [17-09-2024(online)].pdf	2024-09-17
14	202341062744-FORM-5 [17-09-2024(online)].pdf	2024-09-17
15	202341062744-FORM 18 [17-09-2024(online)].pdf	2024-09-17
16	202341062744-ENDORSEMENT BY INVENTORS [17-09-2024(online)].pdf	2024-09-17
17	202341062744-DRAWING [17-09-2024(online)].pdf	2024-09-17
18	202341062744-COMPLETE SPECIFICATION [17-09-2024(online)].pdf	2024-09-17