Specification
Description:
FIELD OF THE INVENTION
The invention relates to a geospatial analytics platform for retail analytics, which includes market analysis and retail location modelling. The analytical platform enables decision makers in different organisations to conduct an in-depth analysis of their present retail business and to understand the retail business dynamics in a geospatial environment. The supported analyses include market expansion by identifying new locations suitable for opening a new store or for liaising with new outlets for supply, distribution network assessment and distributor reach analysis, product segment analysis, and region-wise sales contribution analysis. The analytical information contributes to improving the efficiency of the retail universe.
BACKGROUND OF THE INVENTION
The retail sector is one of the most competitive industries because success depends on many factors, such as demographics, consumers’ preferences and tastes, launching the right products at the right place, and analyzing sales and sales cycles. Retailers need to assess their presence/coverage and gaps and provide a hassle-free service to consumers to stay in the market in the long run. A retailer purchases goods in large quantities from manufacturers, directly or through distributors or wholesalers, and then sells them in smaller quantities to consumers for a profit. Retailers have to identify the right buyers and the right segments, which are suited to the area in which they operate.
The retail industry has seen steady growth over the years and has generated millions of jobs and billions of dollars in revenue. The global retail market size is expected to grow from $26.33 trillion in 2022 to $38.71 trillion in 2026 at a CAGR of 10.1%. The global retail market is expected to reach $39.93 trillion in 2030, at a CAGR of 6.3%. The retail market size is an important metric for businesses, investors and analysts. It can be used to assess the health of the retail sector and to compare it with other sectors of the economy. It can also be used to identify trends in consumer spending.
The retail market is constantly evolving, and in the past decade the big retail giants have been leaning towards harnessing the power of geospatial analysis combined with artificial intelligence and have built their own native analytics divisions for transforming data into useful information. Retailers across the globe are struggling with increasing costs, inefficient operations, massive debt, dissatisfied customers, declining sales, disrupted supply chains, unsold inventory, and upstart competition. This has prompted them to adopt technological tools to improve their business operations and outcomes.
Retailers are sitting on silos of unexplored data about their businesses. With the power of Geospatial Artificial Intelligence (Geo-AI) and predictive and data analytics, this data could transform their business and empower them to consume, analyze and find insights in data at an unprecedented level of granularity. However, one can have a lot of data and know absolutely nothing. It has become imperative to extract information from these huge databases and understand the problem in order to make a good decision. A simple spreadsheet with millions of instances, or a map with a million points on it, though having some spatial component, baffles the human mind, and it is not easy to make a good business decision based on such data. This is where the power of Geospatial Artificial Intelligence for retail becomes important. The analysis goes beyond what our brains can do through visual analysis on the map, from quantifying simple descriptive statistics all the way to more sophisticated inferential statistics for testing null hypotheses and drawing conclusive evidence from the data.
The analytical platform includes an array of spatial data and applies advanced machine learning and artificial intelligence techniques to extract valuable information and to analyze the extracted information to improve retail business in the Consumer Packaged Goods (CPG) space. The invention aims at improving the efficiency of the retail business and at bringing forth the factors responsible for improving sales. The proposed analytical platform implements a machine learning regression technique which performs feature engineering and variable selection with the aim of understanding the dynamics of the retail business, searching for potential regions for business expansion, and increasing distribution network efficiency through geospatial visualization and spatial autocorrelation analysis.
US Patent 11501042 titled “Decision with Big data” provides a framework for applying artificial intelligence to aid with product design, mission or retail planning. The patent outlines a novel approach for applying predictive analytics to the training of a system model for product design, assimilates the definition of meta-data for design containers to that of labels for books in a library, and represents customers, requirements, components and assemblies in the form of database objects with relational dependence. Chinese Patent CN111295681 titled “Demand prediction using a weighted hybrid machine learning model” predicts demand for an item by receiving historical sales data for the item over a plurality of past time periods, the historical sales data including a plurality of features defining one or more feature sets. US Patent 11625736 titled “Using machine learning to train and generate an insight engine for determining a predicted sales insight” describes an insight engine, which takes into account a wide variety of information from different sources for predicting a sales insight. The insight engine is generated using machine learning. Historical customer-specific information, product-specific information, and environmental information are aggregated, based on customer, product, and/or time period, into historical customer profiles. The historical customer profiles are labeled with historical sales insights to form a training set. A machine learning algorithm is applied to the training set to generate an insight engine. The insight engine is applied to a target customer profile to determine a predicted sales insight for a target entity.
All these patent publications provide insights about specific aspects of retail but fail to analyze spatial data at each level of granularity and to produce a visual map that provides specific insights for retail analytics.
SUMMARY OF THE INVENTION
A computer implemented analytical platform with geospatial data analytics for improving the retail business in the consumer goods sector is disclosed. The computer implemented retail analytical platform uses a set of statistical techniques and procedures for increasing the sales of an existing retail business and for creating market expansion plans and strategies. The computer implemented retail analytical platform uses machine learning algorithms in a layered approach to gain market intelligence, analyze business drivers and determine their impact on the retail business. Furthermore, the computer implemented analytical platform performs grid level analysis of a designated region along with distribution network analysis, product segment analysis and region-wise performance analysis of retail sales.
In some embodiments, the computer implemented analytical platform may improve the understanding of the demographics of a pre-specified area, where the retail business is located. The understanding of relationships between sales, demographics and socio-economic status of the population helps retail businesses in launching the right products at the right locations.
A computer implemented analytics platform for retail business analytics for expansion of business in a specific geographical region is disclosed. The computer implemented analytics platform includes a processor connected to a memory. The memory has different modules, and each module comprises computer readable instructions which, when executed by the processor, perform the steps of: (a) collecting and aggregating data from one or more data sources; (b) sanitizing the data by removing association among one or more variables; (c) selecting one or more variables from a given set of variables; (d) calculating the relevance of each feature and calculating factor weights for each of the features; (e) providing one or more layers, each layer providing analysis of the retail business, which determines the gaps in the retail business along with visualization of the gaps; (f) providing an analytical engine implementing machine learning algorithms stored in an analytical database and configured with an artificial intelligence module trained using a training data set to predict the data for closing the gaps in the retail business; (g) analyzing, in each layer, retail business data using the analytical engine to uncover gaps in the retail business; and (h) superimposing/combining/mapping one or more layers of geospatial data to determine the retail business gaps and using the analytical engine to provide solutions to these gaps.
In some embodiments, the demographic features may reveal their contribution to the sales of certain products as opposed to other products.
In embodiments, different implementations may cater to different objectives related to improvement of retail business, for example, identification of gaps in business presence in the geographical region.
In some embodiments of the present invention, the computer implemented analytical platform may implement a layered solution, which involves analyzing each feature against the set objectives. The outcome of each layer is superimposed over the output of another layer, providing additional information. Subsequently, the outcomes of all the layers are superimposed to analyze and provide insights.
In embodiments, the set objectives may be related to increasing sales, identifying products to launch in specific area or region, increasing efficiency of distribution network, identifying best location of stores in a specific area or region and other objectives related to retail sales.
The layered solution for retail analytics is implemented by adding layers of analyzed data one by one. Each layer of analyzed data is then superimposed over the next layer of analyzed data. In this way, each layer of data is analyzed separately and then combined together to get the overall dimensions (picture) of the retail business. In some embodiments, the geospatial data layer may be added to the previously analyzed layers to capture geospatial consequences for the retail business. In some embodiments, the different layers of data may include: demographic data such as literacy, working and non-working male and female population, age-group-wise population percentages in different geographical regions, number of households, and population assets data as per census; geospatial data; outlet and distributor location data, which include active and inactive outlets and distributors; sales data linked to each outlet and distributor; the outlets which have not been approached by the business in question and the areas in the geographical region that are still to be covered by the retail business in question; and locations of important points of interest, including schools, colleges and market places.
In embodiments, each layer of analyzed data may be combined in different orders to produce data analytics. The analytical platform may provide layered data driven solutions by implementing and developing models to help the stakeholders make important decisions using the geospatial insights. In some embodiments, the geospatial insights provided by the analytical engine facilitate planning, managing, organizing and enhancing the retail business.
In one embodiment, the analytical platform may include one or more applications for improvement in efficiency of the retail business. The analytical platform uses a layered set of solutions, which comprises a combination of various statistical, machine learning/artificial intelligence and geospatial techniques, which provides suggestions for improving the retail business. In some embodiments, the improvement in the retail business is related to improving sales in the geographical region.
In some embodiments, the layered solution may be implemented in machine learning algorithms to make accurate predictions of the set objectives related to improvement of retail sales in various outlets. In another implementation, the layered solution may be linked to the assessment of the demographics and points of interests (POI) and other business driver densities. The performance of the outlets can be assessed based on certain key performance indicators like a minimum threshold sales target achieved. The analyses and insights provided by the analytical platform may help the decision makers to make timely interventions for improving the performance of their business especially intervening in those cases where the outlets are not showing any sales in the recent past.
In some embodiments, the analytical platform may access real time data related to newly acquired outlets and related sales on a daily/monthly/quarterly basis, which allows the application to identify highly specific, extremely valuable information using spatial information systems and custom maps. The platform has a feature of real time data ingestion, and the insights are updated based on the updated data at the backend. For example, outlet-wise, distributor-wise and region-wise daily, monthly and quarterly sales data may help in improvement of the retail business.
In some embodiments, the analytical platform assimilates different datasets with tagged location information and integrates it with the geo-demographic data of the location and information related to outlets, distributors, POIs, and other attributes associated with the business drivers such as property rates, other businesses in the vicinity of the existing outlets that may affect the outlet’s business and the sales of a particular product segment. Through the combined use of assimilated data and location intelligence, the analytical platform can detect patterns in the dataset and ascertain the factors responsible for success and failure of the retail business with respect to the sales. In addition, once trained with a training data set, the analytical platform may also identify factors responsible for the success or failure of the outlets.
The analytical platform may include a data aggregation module. The data aggregation module collects data related to the retail universe in a particular geographical region, geo-spatial data, government data, demographic data, and ancillary data. The aggregated data is used to prepare, train and predict the factors responsible for improving different set objectives such as but not limited to improving the sales, finding new business opportunity locations, distribution network analysis and distributor reach, product segment analysis, regional sales analysis and other set objectives for improvement of business efficiency by reallocation of outlets to distributors based on their proximity to retailers and their geolocation advantage with respect to supply of products and goods.
In addition, data aggregation module associated with the analytical platform may collect data from other sources such as proprietary data, geographical data, open source data available on the internet, data related to the retail business in question, demographic and assets data, socio-economic classification (SEC) data or some other type of data for prediction and improvement of retail business.
In some embodiments, the analytical platform may be provided with ancillary data. The ancillary data may include property rates, residential and commercial zones, residential, commercial and connectivity index, and road network of the geographical area under consideration. All the three indices, residential, commercial and connectivity, are measured on a scale of 1 to 3, the highest score 3, indicating the dominance of a particular feature in that area. For example, connectivity index of 3 means that the area is very well connected with a dense network of streets. A unique feature of the analytical platform is combining different data in the data aggregation module with the geospatial data in the layered form. This improves the efficiency, prediction and provides additional insights for determining the gaps in coverage and to improve the business efficiency. The analytical platform also includes a geospatial module, which integrates and combines geospatial data with other data to improve prediction.
The analytical platform may further include a dimensionality reduction module to select the relevant variables from a given set of input variables. The dimensionality reduction module may sanitize the aggregated data by removing association among different variables, for example, by removing multicollinearity, autocorrelation and other types of data discrepancies such as interrelation among one or more variables. This interrelation among different variables may be linear, exponential or logarithmic. In embodiments, the statistical procedure for removing the multicollinearity among input variables (dimensions) may depend upon the type of collinearity in the data. Other inconsistencies such as outliers, duplicates and other forms of erroneous data are also removed.
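By way of a non-limiting illustration, the removal of linearly associated variables described above may be sketched as follows. The sketch assumes a simple pairwise Pearson correlation filter; the function names, the 0.9 threshold and the sample values are hypothetical and are not taken from the specification, which leaves the statistical procedure open.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column shows no linear association
    return cov / sqrt(var_x * var_y)

def drop_collinear(columns, threshold=0.9):
    """Keep a variable only if it is not strongly correlated with one already kept.

    `columns` maps a variable name to its values (one value per region).
    `threshold` is a hypothetical cut-off chosen for illustration.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Made-up values for five regions.
data = {
    "population": [1000, 2000, 3000, 4000, 5000],
    "households": [250, 500, 750, 1000, 1250],  # exactly population / 4
    "schools": [3, 1, 4, 1, 5],
}
selected = drop_collinear(data)
print(selected)
```

In this made-up data, the households variable is a fixed multiple of population and is therefore dropped, while the schools variable is retained. Non-linear (exponential or logarithmic) association would require transforming the variables before applying such a filter.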
A feature engineering module provides identification and selection of features/independent variables for creating a prediction model. The feature engineering module calculates the importance of each feature by calculating factor weightage for each of the features. The parameters are constant values that are provided to the variables during model building to create multivariate statistical prediction equations /models.
The analytical platform further includes an analytical engine implementing machine learning algorithms, which are trained using a training dataset and tested using a test dataset, each comprising one or more variables selected by the feature engineering module, to optimize the set goals and analyses of the retail business and its associated sales. The analytical platform may utilize the training data to train and develop one or more machine learning models or artificial intelligence modules to predict the sales and provide insights for its improvement. In some embodiments, the test dataset is passed to the machine learning model for the first time only after training and is used for calculating the accuracy of the machine learning model.
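As a minimal, non-limiting sketch of the train/test separation described above, the following example holds out a portion of the records, fits a model on the remaining records only, and scores the model on the held-out records. A one-variable least squares line is used here purely as a stand-in for the machine learning model; the function names, the 25% hold-out fraction and the data are assumptions made for illustration.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and hold out a fraction as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(rows):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(rows, intercept, slope):
    """Coefficient of determination of the fitted line on the given rows."""
    mean_y = sum(y for _, y in rows) / len(rows)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for _, y in rows)
    return 1.0 - ss_res / ss_tot

# Made-up (driver count, sales) pairs that follow an exact line.
rows = [(x, 2.0 * x + 5.0) for x in range(1, 13)]
train_rows, test_rows = train_test_split(rows)
intercept, slope = fit_line(train_rows)      # the model sees training rows only
test_r2 = r_squared(test_rows, intercept, slope)  # scored on unseen rows
print(len(train_rows), len(test_rows), round(test_r2, 6))
```

Because the test rows never enter `fit_line`, the resulting score reflects performance on data the model has not seen, which is the purpose of the separate test dataset.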
In some embodiments, the analytical platform may include an artificial intelligence module, which may implement one or more machine learning modules and algorithms for feature selection and the resulting variables may be used for the prediction of sales.
A recommendation module provides prediction of the outcomes based on the set objectives that are provided as recommendations. Furthermore, the recommendations are stored in the database, which can be retrieved later for calculating the accuracy of the analytical platform by comparing the predictions with actual results. In addition, the cross validation of the artificial intelligence/machine learning model is performed to check the robustness of the machine learning model.
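The comparison of stored predictions with actual results may, for example, be sketched as follows. The specification does not name an accuracy measure; the mean absolute percentage error used here, together with the function name and the sample figures, is an assumption made for illustration only.

```python
def mean_absolute_percentage_error(predicted, actual):
    """MAPE in percent over records present in both mappings.

    Records whose actual value is zero are skipped to avoid division by zero.
    """
    errors = [
        abs(predicted[key] - actual[key]) / abs(actual[key])
        for key in predicted
        if key in actual and actual[key] != 0
    ]
    if not errors:
        raise ValueError("no comparable records")
    return 100.0 * sum(errors) / len(errors)

# Made-up stored predictions and later observed sales, keyed by outlet.
stored_predictions = {"outlet_1": 110.0, "outlet_2": 90.0, "outlet_3": 50.0}
actual_sales = {"outlet_1": 100.0, "outlet_2": 100.0}

mape = mean_absolute_percentage_error(stored_predictions, actual_sales)
print(round(mape, 2))
```

Here outlet_3 has no actual result yet and is therefore left out of the comparison.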
In other embodiments, a computer implemented method and system for deriving the analytics of retail business for assessment of reasons for success and failure of the outlets in selling the products is implemented using geospatial, geodemographic and SEC analysis, data visualization and machine learning multivariate model. In some embodiments, the analytical platform may evaluate the sales performance at a regional level and factors responsible for sales performance of the region and also evaluate the region against the set key performance indicators (KPIs).
In some embodiments, the analytical platform may perform the geo-demographic profiling wherein the locations of active and inactive outlets, active and inactive distributors and wholesalers are analysed with respect to the demographics of the region. The relationships between the retail business elements and the demographics like the population density, working and non-working population, literate and illiterate population, high and low affluence zones is deduced. The geodemographic profiling gives the preliminary analysis of the variables that might be impacting the retail business. The analysis may provide information related to the performance of the outlets in relation to these variables.
In some embodiments, the analytical platform may perform an analysis in which it finds the correlation between the socio-economic classification (SEC) A, B, C, and D factors and income and age variables, such as per capita income, income groupings (e.g. 0.75 lacs to above 10 lacs) and various age groups. The SEC groups are defined as follows:
SEC A - Includes the most affluent households, where the chief wage earner is a businessman or professional or an executive/manager and his/her highest educational qualification is graduate/post-graduate;
SEC B - Includes the second most affluent households, where the chief wage earner of these households is a school educated businessman or graduate/post graduate skilled worker;
SEC C - Includes households in the middle rung of affluence, where the chief wage earner of these households is a skilled worker and his/her highest educational qualification is school educated but not graduate;
SEC D and E - Represents households in low-income levels, where the chief wage earner of these households is either a primary level educated skilled worker or an unskilled worker.
In some embodiments, the analytical platform may perform market expansion analysis. To this end, exploratory data analysis, correlation analysis and feature engineering are carried out to determine the most critical business drivers affecting sales, using sales as the dependent variable and all other demographics, assets and POI datasets as independent variables. The feature engineering is performed using the random forest decision tree regressor model. The model assigns a weight to each variable, which also indicates the feature importance. The sales potential of each region is calculated as the weighted average of the variable counts in that region, using the variable weights:
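$$W = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$
where $W$ is the sales potential of a region, $w_i$ is the weight of the $i$-th variable, $X_i$ is the count of the $i$-th variable in that region, and $n$ is the number of variables.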
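For illustration only, the weighted average W defined above may be computed per region as in the following sketch. The variable names, weights and counts are made-up values; in practice the weights would come from the feature importance step, and the counts may need to be brought to a common scale first, which is an assumption not addressed here.

```python
def sales_potential(weights, counts):
    """Weighted average W = sum(w_i * X_i) / sum(w_i) for one region.

    `weights` maps variable name -> weight w_i.
    `counts` maps variable name -> count X_i in the region (missing = 0).
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(w * counts.get(name, 0) for name, w in weights.items())
    return weighted_sum / total_weight

# Hypothetical weights and per-region counts.
weights = {"population_density": 0.5, "schools": 0.3, "households": 0.2}
regions = {
    "ward_1": {"population_density": 10, "schools": 4, "households": 6},
    "ward_2": {"population_density": 2, "schools": 1, "households": 3},
}

potential = {name: sales_potential(weights, counts) for name, counts in regions.items()}
for name, value in sorted(potential.items(), key=lambda item: -item[1]):
    print(name, round(value, 2))
```

Regions can then be ranked by W, with higher values indicating a higher estimated sales potential.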
The analysis of potential versus coverage and of population density versus coverage is performed, and quadrant charts are drawn to check the gaps in expansion. A region where the potential is high and the coverage is very low is selected as a potential location for expansion. Similarly, an area where the population is very dense but the coverage is low is also selected as a potential region for expansion.
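The quadrant analysis described above may be sketched, in a non-limiting manner, as a classification of each region against two cut-off values. The use of the median as the cut-off, the quadrant labels and the sample figures are assumptions made for illustration; the specification does not fix how the quadrant boundaries are chosen.

```python
from statistics import median

def quadrant(potential, coverage, potential_cut, coverage_cut):
    """Place a region in one of four quadrants of a potential-versus-coverage chart."""
    high_potential = potential >= potential_cut
    high_coverage = coverage >= coverage_cut
    if high_potential and not high_coverage:
        return "expansion candidate"   # high potential, low coverage
    if high_potential and high_coverage:
        return "well covered"
    if high_coverage:
        return "over covered"          # low potential, high coverage
    return "low priority"

# Made-up (potential, coverage fraction) pairs per region.
regions = {
    "R1": (8.0, 0.2),
    "R2": (7.0, 0.9),
    "R3": (2.0, 0.8),
    "R4": (1.0, 0.1),
}
potential_cut = median(p for p, _ in regions.values())
coverage_cut = median(c for _, c in regions.values())

labels = {
    name: quadrant(p, c, potential_cut, coverage_cut)
    for name, (p, c) in regions.items()
}
print(labels)
```

The same function can be reused for the population density versus coverage chart by passing population density in place of potential.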
In some embodiments, the analytical platform may perform a prioritized region-wise expansion analysis. Multiple maps of all critical business drivers are prepared, and these geospatial maps are combined to generate one single map, which is the intersection of all critical features. This map is used to generate a business driver hotspot map. The potential regions are prioritized on the basis of their vicinity to the business driver hotspots (which take into account the most critical POIs affecting business, such as schools or colleges) and the socio-economic clusters. The regions are classified into three different categories, i.e. high, medium and low priority for expansion. The businesses need to conduct ground validation before operationalizing the expansion plans.
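A simplified, non-limiting sketch of prioritizing regions by their vicinity to business driver hotspots is given below. It assumes projected coordinates in metres and uses straight-line distance to the nearest hotspot; the 1 km and 3 km thresholds, the function names and the coordinates are hypothetical, and the socio-economic clusters mentioned above are not modelled in this sketch.

```python
from math import hypot

def nearest_hotspot_distance(point, hotspots):
    """Straight-line distance in metres from a point to the closest hotspot."""
    return min(hypot(point[0] - hx, point[1] - hy) for hx, hy in hotspots)

def priority(distance_m, high_within_m=1000.0, medium_within_m=3000.0):
    """Map a distance to a priority class using hypothetical thresholds."""
    if distance_m <= high_within_m:
        return "high"
    if distance_m <= medium_within_m:
        return "medium"
    return "low"

# Made-up hotspot locations and region centroids (x, y) in metres.
hotspots = [(0.0, 0.0), (5000.0, 5000.0)]
centroids = {
    "R1": (300.0, 400.0),
    "R2": (5000.0, 7000.0),
    "R3": (10000.0, 0.0),
}

priorities = {
    name: priority(nearest_hotspot_distance(xy, hotspots))
    for name, xy in centroids.items()
}
print(priorities)
```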
In some embodiments, the analytical platform will analyse the distribution network with respect to gaps in the distributor presence in various geographical regions within the study area, the best and worst performing distributors, and the distributors active in the past 4 years, which included the COVID-hit years 2019 and 2020. A geospatial analysis of distributor presence is done, and a geospatial map of region-wise gaps in distributor presence is created, highlighting the percent area with no distributor presence.
In other embodiments, the computer implemented analytical platform may perform distributor reach analysis using geospatial buffer analysis. Buffer zones are created around each distributor, where the buffers are drawn at different radial distances from the center based on the outlets catered to by that particular distributor. The procedure entails analysis of each distributor’s coverage of a geographical region, the number of outlets linked to it and the average sales generated in one quarter in each buffer ring. The analysis reveals the distributor performance with increasing distance from its location based on mechanised and non-mechanised delivery. For example, for one of the distributors it was found that within a distance of 2 km, the distributor covers 149 outlets with cumulative sales of Rs. 687, and one km further out, the distributor covers an additional 406 outlets but the sales improve only marginally, from Rs. 687 to Rs. 737. However, in the remaining outer rings, the sales rise in proportion to the outlet count. It is only within the 3 km ring that the sales are not optimized. Hence, it was deduced that this ring needs intervention, and it is recommended that the general trade retail operators in it either be realigned to another distributor or that the sales potential of this ring be reassessed.
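The buffer ring analysis described above may be illustrated by the following sketch, which assigns each outlet to a concentric ring around its distributor and totals the outlet count and sales per ring. It assumes latitude/longitude coordinates, great-circle (haversine) distance and 1 km wide rings; the function names, coordinates and sales figures are made up for illustration and do not reproduce the example figures given above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def ring_summary(distributor, outlets, ring_width_km=1.0):
    """Group outlets into rings (0 = innermost) and total outlets and sales per ring."""
    summary = {}
    for lat, lon, sales in outlets:
        distance = haversine_km(distributor[0], distributor[1], lat, lon)
        ring = int(distance // ring_width_km)
        entry = summary.setdefault(ring, {"outlets": 0, "sales": 0.0})
        entry["outlets"] += 1
        entry["sales"] += sales
    return dict(sorted(summary.items()))

# Made-up distributor location and (lat, lon, quarterly sales) per outlet.
distributor = (12.970, 77.590)
outlets = [
    (12.975, 77.590, 120.0),  # roughly 0.56 km away
    (12.985, 77.590, 80.0),   # roughly 1.67 km away
    (12.986, 77.590, 40.0),   # roughly 1.78 km away
    (12.995, 77.590, 30.0),   # roughly 2.78 km away
]

rings = ring_summary(distributor, outlets)
for ring, entry in rings.items():
    average = entry["sales"] / entry["outlets"]
    print(f"{ring}-{ring + 1} km: {entry['outlets']} outlets, average sales {average:.1f}")
```

A ring whose outlet count grows while its average sales per outlet falls would be a candidate for the kind of intervention described above.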
In other embodiments, the analytical platform will perform a spatial analysis of the buffer zones of distributors with respect to the wards in which they are located. The buffer zones/concentric circles are placed on top of color coded wards where different colors represent the wards contributing to the bottom 25% quarterly sales and wards with 50% and 75% sales contribution. This spatial analysis reveals the performance of the distributor as reflected by the wards in which the distributor and its associated outlets are situated.
In some embodiments, the analytical platform will run a grid level analysis where the study region is divided into 500 m x 500 m grids. All grids where there is no presence of the business are extracted and the percent coverage of the retailers is calculated. In an example analysis, it was found that the business covered only 25% of the region with respect to outlet presence; the remaining 75% is still to be explored. The grids with high business driver densities are extracted and may be recommended for market expansion.
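A minimal sketch of the grid level coverage calculation is given below. It assumes a rectangular study region and outlet positions expressed in projected coordinates in metres measured from the region's lower-left corner; the function names, region size and outlet positions are hypothetical.

```python
from math import ceil

def grid_cell(x_m, y_m, cell_size_m=500):
    """Return the (column, row) index of the grid cell containing a point."""
    return (int(x_m // cell_size_m), int(y_m // cell_size_m))

def grid_coverage(outlets_xy, width_m, height_m, cell_size_m=500):
    """Return occupied cells, empty cells and percent coverage of the region."""
    n_cols = ceil(width_m / cell_size_m)
    n_rows = ceil(height_m / cell_size_m)
    all_cells = {(c, r) for c in range(n_cols) for r in range(n_rows)}
    # Outlets falling outside the region are ignored by the intersection.
    occupied = {grid_cell(x, y, cell_size_m) for x, y in outlets_xy} & all_cells
    empty = all_cells - occupied
    percent = 100.0 * len(occupied) / len(all_cells)
    return occupied, empty, percent

# Made-up 2 km x 1 km region (8 cells) with three outlets in two cells.
outlets_xy = [(100, 100), (300, 450), (1200, 700)]
occupied, empty, percent = grid_coverage(outlets_xy, width_m=2000, height_m=1000)
print(sorted(occupied), len(empty), percent)
```

The set of empty cells is the starting point for the expansion recommendations, and can be further filtered by business driver density.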
In some embodiments, the analytical platform will perform a grid-wise specific product segment analysis. The complete region is divided into 500 m x 500 m grids. A product segment is chosen based on its high sales record, and the goal is to find those regions where this product has very good scope of sale but is being poorly supplied. A feature engineering analysis reveals the variables or the business drivers that are affecting the sales of this product segment. A grid map of current retailer outlets is created, and those grids are extracted where the retailer has its outlets but is not supplying this particular product. One geospatial layer of each variable is created, and the intersection of those grids where retailer outlets are present and the business drivers are also present, but the segment is not being supplied, is extracted. These grids become the supply points for the product segment. Additionally, those grids are extracted where there are outlets which do not belong to the retailer and the business drivers are also present. The recommendations include the locations of these grids, which contain potential outlets that can be approached for the supply of the product segment in question.
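The grid intersections described above may be expressed, for illustration, as set operations over grid cell identifiers. In the following sketch each layer is assumed to be a set of (column, row) cells; the layer names, function names and cell values are hypothetical.

```python
def drivers_present(driver_layers):
    """Cells in which every business driver layer is present."""
    layers = list(driver_layers.values())
    if not layers:
        return set()
    return set.intersection(*layers)

def segment_gap_cells(own_outlet_cells, segment_supplied_cells, driver_layers):
    """Cells with own outlets and all drivers, where the segment is not supplied."""
    return (own_outlet_cells - segment_supplied_cells) & drivers_present(driver_layers)

def prospect_cells(other_outlet_cells, driver_layers):
    """Cells with outlets not belonging to the retailer and all drivers present."""
    return other_outlet_cells & drivers_present(driver_layers)

# Made-up layers, one set of grid cells per layer.
own_outlet_cells = {(0, 0), (0, 1), (1, 1), (2, 0)}
segment_supplied_cells = {(0, 0), (2, 0)}
other_outlet_cells = {(2, 1), (3, 1)}
driver_layers = {
    "schools": {(0, 1), (1, 1), (2, 1), (3, 0)},
    "population_density": {(0, 1), (2, 1), (3, 0), (0, 0)},
}

supply_points = segment_gap_cells(own_outlet_cells, segment_supplied_cells, driver_layers)
prospects = prospect_cells(other_outlet_cells, driver_layers)
print(sorted(supply_points), sorted(prospects))
```

Requiring every driver layer to be present is one possible reading of the intersection; an embodiment could equally require only a subset of the drivers.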
In one variation of this implementation, a computer implemented method and system may analyse the region-wise sales and the reasons behind high, low and no sales. A geospatial analysis is carried out to find those regions which contribute to the top 50% sales, top 75% sales and bottom 25% sales of the company. These regions are labelled as category A, category B and category C regions respectively. As a first step, a machine learning model is created to conduct a feature engineering analysis and find the critical business drivers which are responsible for high sales in these regions. In an exemplary analysis, it was found that the most important features contributing to sales as per the machine learning model are population density, schools, colleges, affluence/property rates, number of households, household workers, labour class (marginal workers, cultivators, agriculture labour) and children below 6 years, which agrees with intuition. Later on, it was revealed by the model that the highest selling segment is very low priced, mostly appeals to children and mainly sells in the regions where a very low-income group resides. This analysis may help the retailer to seek those regions for expansion where schools/colleges or any other critical business drivers are present in large numbers.
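The labelling of regions as category A, B or C may be sketched as follows. The sketch ranks regions by sales and assigns a category from the cumulative sales share reached before each region is added, so that category A regions together account for at least the top 50% of sales; this boundary rule, the function name and the sales figures are assumptions made for illustration.

```python
def categorise_regions(sales_by_region):
    """Label regions A (top 50% of sales), B (up to 75%) or C (remainder)."""
    total = sum(sales_by_region.values())
    if total <= 0:
        raise ValueError("total sales must be positive")
    categories = {}
    cumulative = 0.0
    ranked = sorted(sales_by_region.items(), key=lambda item: -item[1])
    for region, sales in ranked:
        share_before = cumulative / total  # share covered by higher-ranked regions
        if share_before < 0.50:
            categories[region] = "A"
        elif share_before < 0.75:
            categories[region] = "B"
        else:
            categories[region] = "C"
        cumulative += sales
    return categories

# Made-up quarterly sales per region.
sales_by_region = {"R1": 400, "R2": 100, "R3": 250, "R4": 150, "R5": 100}
categories = categorise_regions(sales_by_region)
print(categories)
```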
In some embodiments, the analytics platform accesses different types of data related to improvement and assessment of the quality of business to identify highly specific, extremely valuable information using geospatial information and custom maps.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates an operating environment of a computer implemented analytical platform for retail analytics and analysis in an embodiment of the present invention;
Fig. 2 illustrates different hardware components of a computer implemented analytical platform for retail analytics and analysis in an embodiment of the present invention;
Fig. 3 illustrates different sub-modules of an analytical module for improving the quality of retail business and sales analysis in an embodiment of the present invention;
Fig. 4 illustrates different modules in a data aggregation module in an embodiment of the present invention;
Fig. 5 illustrates a list of variables used by the analytical platform for retail analytics for improving the quality of retail business and sales analysis in an embodiment of the present invention;
Fig. 6 illustrates different components of a dimensionality reduction module in an embodiment of the present invention;
Fig. 7 shows the demographic variables and their values for an exemplary geographic region in an embodiment of the present invention;
Fig. 8 illustrates the heat map of different demographic variables and business drivers in an embodiment of the present invention;
Fig. 9 illustrates important features affecting the sales of the retail business in the geographical area under study in an embodiment of the present invention;
Fig. 10 illustrates a statistical analysis module in an embodiment of the present invention;
Fig. 11 illustrates different components of a feature engineering module in an embodiment of the present invention;
Fig. 12 illustrates the region-wise quarterly sales of active distributors and active outlets in an embodiment of the present invention;
Fig. 13 illustrates the geographical region in a grids form providing information related to number of retail outlets in a specific geographical region in an embodiment of the present invention;
Fig. 14 illustrates an exemplary image of distributor network and distributor reach for retail business and sales in an embodiment of the present invention;
Fig. 15 illustrates the distributor reach analysis table which shows the variation of average quarterly sales with increasing distance from the distributor in an embodiment of the present invention;
Fig. 16 illustrates a geospatial analysis module in an embodiment of the present invention;
Fig. 17 illustrates a process of analysis of retail business sales in an embodiment of the present invention;
Fig. 18 illustrates a process of geospatial analysis in an embodiment of the present invention;
Fig. 19 illustrates a process of analysis of retail business sales and distributor network in an embodiment of the present invention;
Fig. 20 illustrates a process of market expansion with region wise geospatial maps in an embodiment of the present invention;
Fig. 21 illustrates a process of analysing region wise gaps in distributor network in an embodiment of the present invention;
Fig. 22 illustrates a process of determining action required by distributor network for increasing retail sales in an embodiment of the present invention;
Fig. 23 illustrates a process of analysis of retail business for improvement of sales using grid analysis in an embodiment of the present invention;
Fig. 24 illustrates a process of alignment of products and sales for increasing retail business in an embodiment of the present invention;
Fig. 25 illustrates a process of recommendations regarding expanding in grids with gaps in an embodiment of the present invention;
Fig. 26 illustrates a process of analysis of retail business sales and distributor network in an embodiment of the present invention;
Fig. 27 illustrates a process of improvement of sales for retail business in an embodiment of the present invention;
Fig. 28 illustrates a process of data filtering and data transformation for retail sales in an embodiment of the present invention;
Fig. 29 illustrates the process of determination of weight to be attached with each variable for feature analysis in an embodiment of the present invention;
Fig. 30 illustrates the process of analysing low/high sales in a geographical region in an embodiment of the present invention; and
Fig. 31 illustrates the overall process of retail business expansion, sales analysis and enhancement of distribution network efficiency in an embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
As used herein any reference to “one embodiment” or “an embodiment” means that a particular element, feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. The appearances of the phrase “in one embodiment” or “in one implementation” or “in variation of the implementation” at various places in the specification are not necessarily all referring to the same embodiment.
As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. The terms “a” or “an” are employed to describe elements and components of the embodiments herein. This is done merely for convenience and to give a general sense of the invention. This description should be read to include one or at least one, and the singular also includes the plural unless it is obvious that it is meant otherwise.
Fig. 1 illustrates an operating environment of a computer implemented analytical platform for retail analytics and retail business analysis in an embodiment of the present invention. The computer implemented analytical platform environment 100 includes an analytical platform 110. The analytical platform 110 is connected with one or more geographical areas, for example, a geographical area 102 and/or a geographical area 104.
The geographical area 102 comprises one or more outlets 120A, one or more distributors 130A, one or more points of interest (POI) 140A, and area demographics 150A associated with the demographics of the geographical area 102. Likewise, the geographical area 104 comprises one or more outlets 120B, one or more distributors 130B, one or more points of interest (POI) 140B, and one or more demographics 150B. In other variations of the invention, the retail analytical platform 110 may be connected with more than two geographical areas; two geographical areas are shown merely for the purpose of illustration.
The outlets 120A and the outlets 120B may be collectively referred to as outlets 120. Similarly, the distributors 130A and the distributors 130B may be collectively referred to as distributors 130, the POIs 140A and the POIs 140B may be collectively referred to as POIs 140, and the demographics 150A and the demographics 150B may be collectively referred to as demographics 150. Furthermore, the retail analytical platform 110 is connected to a server 112, a database 114, a cloud computing environment 118 and other electronic processing devices through a network 108. The retail analytical platform 110 collects and aggregates data related to one or more retail businesses. For example, the retail analytical platform 110 may aggregate data from the outlets 120, the distributors 130 and the POIs 140. Furthermore, the retail analytical platform 110 may collect data related to the demographics 150 for each geographical area, such as the geographical area 102 or the geographical area 104. The collected data is analyzed to perform retail analytics for improving the quality of retail business and for performing sales analysis.
In some embodiments, the retail analytical platform 110 may be implemented in the server 112.
In another embodiment, the retail analytical platform 110 may be implemented over a distributed environment or on a cloud computing environment 118.
In some embodiments, the retail analytical platform 110 can be connected with one or more databases 114. The one or more databases 114 may include an ancillary database, which may store information related to one or more outlets or the profiles of different stakeholders associated with a specific outlet or outlets. For example, the ancillary database may include property rates for residential and commercial areas, residential and commercial zones, the connectivity index, and the road network of the geographical area for one or more outlets. Fig. 1 illustrates an exemplary embodiment of the operating environment 100 of the analytical platform 110, but in other implementations the operating environment 100 may include additional or fewer components than shown in Fig. 1.
Fig. 2 illustrates the different hardware components of a computer implemented analytical platform in an embodiment of the present invention. The analytical platform 110 includes a memory 204, at least one processor 218, an input/output module 220, a communication module 222, an internal bus 214 and an external interface 224. The internal bus 214 allows exchange of data and electrical power between the memory 204, the processor 218, the input/output module 220, the communication module 222 and the other modules. Additionally, the external interface 224 allows communication with external devices such as the server 112, the database 114 or the cloud computing environment 118. The memory 204 may include one or more operating systems 208, one or more applications 210, and a retail analytical module 212. Fig. 2 illustrates an exemplary hardware configuration of the analytical platform 110 for retail analysis; however, in other implementations, the analytical platform 110 may have additional or fewer modules.
Fig. 3 illustrates different components of an analytical module in an embodiment of the present invention. The analytical module 212 includes a data aggregation module 302, a statistical analysis module 304, a dimensionality reduction module 308, a feature engineering module 310, a geospatial analysis module 312, an external database 328 and a retail analytical engine 320. The analytical engine 320 includes an analytical database 314, an artificial intelligence module 318, a rule-based engine 322 and a recommendation module 324. The artificial intelligence module 318 is trained using different test data sets to predict the set goals.
In embodiments, the set goals may be related to finding the right areas for expanding the retail business. In another embodiment, the set goals may be finding the critical variables affecting the sales of an outlet, a distributor or a region.
In some embodiments, the analytical module 212 may be implemented in a distributed computing environment or as a distributed system over a network 108.
In some other embodiments, the analytical module 212 may be implemented as software in the server 112.
In some embodiments, the analytical module 212 may be implemented as software as a service (SaaS) on the cloud computing environment 118.
Fig. 4 illustrates different components of a data aggregation module in an embodiment of the present invention. The data aggregation module 302 includes a proprietary data module 402 comprising proprietary data; an enterprise data module 404 comprising government data and other enterprise business data; and a demographic data module 408 comprising data collected from different demographic databases. The data aggregation module 302 further includes a research data module 410 comprising data collected from one or more databases related to research in this area. Furthermore, the research data module 410 includes data collected from different business institutions related to the improvement of retail business. A stakeholder data module 412 comprises data about different stakeholders, and an ancillary data module 414 comprises data related to property rates, residential and commercial zones, the connectivity index, and the road network of the geographical area.
Fig. 5 illustrates a variable list for improving the retail business and sales analysis in an embodiment of the present invention. The invention uses one or more statistical tools and the variables listed in Fig. 5 to solve the problem of improving retail business and to increase the avenues for achieving better sales in each quarter. In some embodiments, the retail analytical platform 110 uses at least 140 variables for analysis of the retail business and selects the relevant variables, which can be utilized for statistical model building. Fig. 5 illustrates the list of variables under each category. For example, there are 52 variables that can be categorized under demographics. Likewise, there are 6 variables categorized under outlets. A distributor may be characterised by 9 variables and a point of interest may be characterised by 72 variables.
In some embodiments, the analytical platform 110 may use a selected number of variables out of the listed 140 variables for sales analysis. The selected variables are analyzed using a statistical model to provide insights related to sales analytics.
In some embodiments, the analytical platform 110 may use a selected number of variables out of the listed 140 variables for sales analysis using the machine learning model.
In some embodiments, the analytical platform 110 may use a selected number of variables out of the listed 140 variables for sales analysis using the machine learning model and the statistical models using a layered approach. In other embodiments, the statistical techniques may be applied for data cleansing and removal of correlated data and then the cleansed data may be used for building machine learning algorithms.
In embodiments, the analytical platform 110 uses different statistical methods, which may include the Spearman rank-order correlation (SROC), a nonparametric test to explore the association strength between variables and underlying social and economic variables. These social and economic variables are drivers of change in the analytical platform 110 for assessing the dynamics of success and failure of businesses and to improve sales.
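As an illustration of the Spearman rank-order correlation mentioned above, the following is a minimal sketch in Python using only the standard library. The function names and the sample values for population and quarterly sales are hypothetical and are not taken from the specification. The coefficient is computed as the Pearson correlation of the ranks, with tied values receiving the average of their rank positions.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values for five grids (illustrative only).
population = [12000, 8500, 23000, 15000, 9800]
quarterly_sales = [61000, 40000, 97000, 70000, 52000]
rho = spearman(population, quarterly_sales)
```

In this hypothetical sample the two variables rise and fall together in rank, so the coefficient is close to +1; a value near zero would indicate little monotonic association between a variable and sales.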
Fig. 6 illustrates different components of a dimensionality reduction module for improvement of retail business and sales analysis in an embodiment of the present invention. The dimensionality reduction module 308 in some embodiments is also referred to as variable reduction module. The dimensionality reduction module 308 comprises a statistical data cleaning module 602, a variable analysis module 604, and a variable selection module 608.
The statistical data cleaning module 602 utilises the Variance Inflation Factor (VIF) to remove certain variables from the set of independent variables, which reduces variance and provides an optimal variable set for prediction. The VIF is an index that measures how much the variance of an estimated regression coefficient increases due to collinearity. A VIF value of less than 10 is targeted for each independent variable. The statistical data cleaning module 602 removes multicollinearity using a stepwise variable elimination procedure. After removal of multicollinearity, 16 variables with a VIF value of less than 10 were selected out of the 140 variables, and these 16 variables were used for model building. A VIF value of ‘infinity’ means that the independent variable is perfectly correlated with other independent variables in the dataset.
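The stepwise elimination described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the implementation of the statistical data cleaning module 602. It relies on the standard identity that the VIF of the j-th variable equals the j-th diagonal entry of the inverse of the correlation matrix of the independent variables. The variable names, the sample values and the helper functions are hypothetical, and the sketch assumes that no column is constant.

```python
from math import sqrt, inf

def correlation_matrix(columns):
    """Pearson correlation matrix of equally long, non-constant columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = sqrt(sum(d * d for d in dev))
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]

def invert(matrix, eps=1e-10):
    """Gauss-Jordan inversion with partial pivoting. Returns (inverse, None),
    or (None, c) when column c is a linear combination of earlier columns."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < eps:
            return None, c
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug], None

def vif_scores(data):
    """VIF per variable. If a variable is perfectly collinear with earlier
    ones, only that variable is reported, with a VIF of infinity."""
    names = list(data)
    inverse, dependent = invert(correlation_matrix([data[k] for k in names]))
    if inverse is None:
        return {names[dependent]: inf}
    return {name: inverse[i][i] for i, name in enumerate(names)}

def eliminate(data, threshold=10.0):
    """Repeatedly drop the variable with the highest VIF until all are below threshold."""
    kept = dict(data)
    while len(kept) > 1:
        scores = vif_scores(kept)
        worst = max(scores, key=scores.get)
        if scores[worst] < threshold:
            break
        del kept[worst]
    return list(kept)

# Hypothetical sample: "households" is an exact multiple of "population".
sample = {
    "population": [10, 20, 30, 40, 50, 60],
    "households": [5, 10, 15, 20, 25, 30],
    "literacy":   [3, 1, 4, 1, 5, 9],
    "schools":    [2, 7, 1, 8, 2, 8],
}
selected = eliminate(sample)
```

In this hypothetical sample, the households column is an exact multiple of the population column, so it receives a VIF of infinity and is removed first; the remaining three variables all have VIF values below 10 and are retained.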
The variable analysis module 604 analyzes whether the relationship between each dependent and independent variable is significant or insignificant with the help of a p-test, t-test or chi-squared test. In addition, the variable analysis module 604 determines whether there exists an interrelationship among the independent variables, that is, multicollinearity. Multicollinearity refers to the problem that arises when the independent variables are collinear. The multicollinearity of the dataset is calculated using the VIF.
In embodiments, the variable selection module 608 may use a Random Forest decision tree algorithm to calculate the importance scores of the variables based on the reduction in the criterion used to select split points. The graph illustrated in Fig. 9 shows the weightage of each variable on a scale of 0 to 1.
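The idea of scoring a variable by the reduction in the split criterion can be illustrated with a deliberately simplified Python sketch. It is not a Random Forest: it evaluates only the single best threshold split per variable, uses the sum of squared errors of the sales values as the criterion, and normalises the reductions so that the scores lie between 0 and 1. A Random Forest would accumulate such reductions over many splits in many trees. The variable names and sample values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (a regression split criterion)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split_reduction(feature, target):
    """Largest drop in SSE achievable by one threshold split on `feature`."""
    pairs = sorted(zip(feature, target))
    total = sse(target)
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal feature values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        best = max(best, total - sse(left) - sse(right))
    return best

def importance_scores(features, target):
    """Criterion reductions normalised so that the scores sum to 1."""
    raw = {name: best_split_reduction(col, target)
           for name, col in features.items()}
    total = sum(raw.values())
    return {name: (value / total if total else 0.0)
            for name, value in raw.items()}

# Hypothetical sample: sales follow "footfall" closely and "noise" hardly at all.
sales = [10, 12, 11, 30, 32, 31]
features = {
    "footfall": [1, 2, 3, 7, 8, 9],
    "noise":    [5, 1, 4, 2, 6, 3],
}
scores = importance_scores(features, sales)
```

Here the footfall variable separates low and high sales almost perfectly with a single split, so it receives most of the weight, while the noise variable receives a small share.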
Fig. 7 illustrates demographic variables for a geographical area under study in an embodiment of the present invention. In an exemplary illustration, the area of a sub-region may be 40 sq. km; the number of active outlets in the region may be 67; the average quarterly sales may be Rs. 82462; and the total population may be 408000. Likewise, other attributes of a given region, with the sample collected, are provided in Fig. 7.
Fig. 8 shows the heat map of different business indicators and demographic variables in an embodiment of the present invention. The analytical platform 110 may initiate the process of building a model using geo-demographic profiling, which links the demographics surrounding the outlets in a specific area with the outlet sales.
The geo-demographic profiling is analyzed by the analytical platform 110. Thereafter, the analytical platform 110 performs the data analysis with the objective of finding the demographic variables that contribute to the retail sales in that geographical area. The artificial intelligence module 318 implements a machine learning model, which is trained with the sales data and with the demographic, POI and other business-related variables as input, to deduce the variables contributing to the sales of an area.
In some embodiments, the outlet, distributor or region is evaluated against the standards set by the retailer. For example, based on the sales exhibited by an outlet in a quarter, the outlet is categorized as platinum (sales per day > Rs. 1250), gold (Rs. 750-1249), silver (Rs. 500-749) and bronze (
Documents
Application Documents
| # | Name | Date |
| 1 | 202411028061-STATEMENT OF UNDERTAKING (FORM 3) [05-04-2024(online)].pdf | 2024-04-05 |
| 2 | 202411028061-PROOF OF RIGHT [05-04-2024(online)].pdf | 2024-04-05 |
| 3 | 202411028061-POWER OF AUTHORITY [05-04-2024(online)].pdf | 2024-04-05 |
| 4 | 202411028061-FORM FOR SMALL ENTITY(FORM-28) [05-04-2024(online)].pdf | 2024-04-05 |
| 5 | 202411028061-FORM FOR SMALL ENTITY [05-04-2024(online)].pdf | 2024-04-05 |
| 6 | 202411028061-FORM 1 [05-04-2024(online)].pdf | 2024-04-05 |
| 7 | 202411028061-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [05-04-2024(online)].pdf | 2024-04-05 |
| 8 | 202411028061-EVIDENCE FOR REGISTRATION UNDER SSI [05-04-2024(online)].pdf | 2024-04-05 |
| 9 | 202411028061-DRAWINGS [05-04-2024(online)].pdf | 2024-04-05 |
| 10 | 202411028061-DECLARATION OF INVENTORSHIP (FORM 5) [05-04-2024(online)].pdf | 2024-04-05 |
| 11 | 202411028061-COMPLETE SPECIFICATION [05-04-2024(online)].pdf | 2024-04-05 |