
Method And System For Initiating And Fulfilment Of Desired Action Over Cohesive Internet Hosted Platforms

Abstract: The present disclosure provides a method and system for initiating and fulfilling a desired action over cohesive internet-hosted platforms. The method includes analysing a first input received from an end-user to detect a product or service to be purchased, or an intent of the end-user, wherein the first input comprises at least one of an unstructured voice command, a text message, a gesture, and data associated with the end-user. The method further includes identifying one or more actions associated with the detected product, service or intent; communicating the identified action(s) to at least one of natively offered services, third parties and connected IoT devices; and receiving a response in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions. The process is performed while maintaining a continuous dialogue connecting each of the inputs from the end-user, thereby replicating the human-to-human understanding that occurs in a real physical conversation. FIG. 1


Patent Information

Application #
Filing Date
17 September 2018
Publication Number
40/2018
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
bhaskar@ipexcel.com
Parent Application

Applicants

AZTRINO ANTHROBOTICS OPC PRIVATE LIMITED
3RBM ROAD, NEAR RTO, PUNE 411001

Inventors

1. BHAVIK CHANDRASEN RUPAREL
3RBM ROAD, PUNE 411001

Specification

Claims:

We Claim:
1. A method for initiating and fulfilment of desired action over a cohesive internet-hosted platform, comprising:

receiving, by a receiving module, a first input from an end-user comprising information associated with at least one of a product to be purchased, a service to be purchased and an intent of the end-user,
wherein the input received is in form of at least one of unstructured voice command, text message, gesture, and data associated with the user captured by at least one of one or more sensors and one or more IoT devices;

analysing, by an intent detection module, the first input to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user,
wherein the detection being performed by one or more application specific modules based on at least one of a predictive analysis, database with reference points, a raw gesture data, Human-defined logics, Machine learning, situational context, and a persistence module;

generating, by the intent detection module, a plurality of parameters with key-value pairs with respect to the detected intent of the end-user, wherein the key is a unit of identification of a particular set of information, and the information being the value of the corresponding key;

identifying, by the persistence module, one or more actions associated with the detected at least one of the product to be purchased, the service to be purchased and the intent of the end-user;

persisting conversations, by the persistence module, by storing a plurality of parameters with key-value pairs generated from second one or more inputs from the end-user on the identified one or more actions to at least one of a database and a memory,
wherein the generated plurality of parameters with key-value pairs being updated with at least one of addition of new one or more parameters, overwriting existing key-value pairs, and appending new information to the key based on the second one or more inputs, thereby maintaining a continuous dialogue connecting each of the inputs from the end-user;

communicating, by an action module, information associated with the confirmed identified one or more actions to at least one of one or more natively offered services, one or more third parties matching with the generated plurality of parameters with key-value pairs and one or more connected IoT devices,
wherein the one or more third parties are connected to the cohesive internet-hosted platform for providing at least one of the product and the service to the end-user,
wherein the one or more connected IoT devices are configured to perform an action in compliance with the communicated action;

receiving a response in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions,
wherein the interested one or more third parties and the one or more native services are ranked by a quality determination module by appraising the response on the basis of at least one of product or service relevance for the end-user, price for the end-user, brand quality, previous fulfilment ratio, previous user feedback and a bid rate for an intermediary; and

enabling the end-user to select one of the native service and the third party from the ranked one or more third parties and one or more native services.

2. The method as claimed in claim 1, wherein the unstructured voice command is converted into text corresponding to a language used in the unstructured voice command.

3. The method as claimed in claim 1, wherein at least one of the converted text and the text message being processed by a natural language processing module before being analysed, by an intent detection module, to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user.

4. The method as claimed in claim 1, wherein the gesture comprises fingers’ movement, hand movement, head movement, head nodding, leg movement, eye movement, eye blinking, and dilation of pupils.

5. The method as claimed in claim 1, wherein one or more gestures are identified in the gesture input before the analysis by a gesture identification module, wherein the one or more gestures are identified in the gesture input by one or more application specific modules based on at least one of a predictive analysis, database with reference points, a raw gesture data, Human-defined logics, Machine learning, situational context, and a persistence module, where each of the gestures is provided with information associated with a specific intent and a corresponding action.

6. The method as claimed in claim 1, wherein the data captured by one or more sensors comprises external data and internal data,

where the internal data represents at least one of a heart rate of the end-user, a pulse rate of the end-user, a blood oxidation level of the end-user, an electroencephalogram (EEG) of the end-user, an electrocardiogram (ECG) of the end-user, and a body temperature of the end-user, and
where the external data represents parameters associated with the surroundings of the end-user.

7. The method as claimed in claim 1, wherein the information associated with the ranked one or more third parties being communicated to the end-user via at least one of a display unit, a haptic unit, one or more connected IoT devices and an audio unit.

8. The method as claimed in claim 1, wherein the first input and the second one or more inputs being received from the end-user via communicatively connected at least one of the display unit, a camera, the haptic unit, the IoT devices and a microphone unit.

9. The method as claimed in claim 1, wherein the response received in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions comprises at least one of the bid rate for the intermediary, price for end-user, information associated with one of the product and the service, time required in fulfilling the identified one or more relevant actions, and alternatives and suggestions if the product or service request in the relevant action is changed or discontinued.

10. The method as claimed in claim 1, further comprises suggesting, by an internal predictive module, one of one or more alternatives and corrections to the user based on the relevance of the response and the price for the end-user from the one or more third parties interested in fulfilling the predicted one or more relevant actions.

11. The method as claimed in claim 1, further comprises arranging for home delivery of the placed order, wherein arranging comprises one of allocating a human resource for pickup and delivery of the placed order and instructing a connected third party for the home delivery.

12. The method as claimed in claim 1, further comprises predicting future needs of the user by the persistence module, wherein the prediction is performed by

analysing frequency and type of the inputs of the user and corresponding intents to identify a pattern;

predicting one or more future requirements of the user based on the identified pattern;

prompting the user to confirm before initiating action relevant to fulfil the predicted one or more future requirements of the user,
initiating action, upon receiving confirmation from the user, to fulfil the predicted one or more future requirements, wherein the initiating action comprises communicating the predicted future requirement to at least one of the matching one or more third parties and one or more connected IoT devices.

13. The method as claimed in claim 1, further comprises scheduling one or more actions based on one of the predicted one or more future requirements of the user and express input for scheduling a task by the user.

14. The method as claimed in claim 1, further comprises making a payment to the selected third party for initiating the identified one or more relevant actions.

15. The method as claimed in claim 14, wherein the payment is made online via one of a connected bank account, Interactive Voice Response (IVR) payment option, credit card, debit card, and digital wallet with or without input of the end user.

16. A system for initiating and fulfilment of desired action over a cohesive internet-hosted platform, comprising:

a receiving module, configured to receive a first input from an end-user comprising information associated with at least one of a product to be purchased, a service to be purchased and an intent of the end-user,
wherein the input received is in form of at least one of unstructured voice command, text message, gesture, and data associated with the user captured by at least one of one or more sensors and one or more IoT devices;

an intent detection module, configured to analyse the first input to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user,
wherein the identification being performed by one or more application specific modules based on at least one of a predictive analysis, database with reference points, a raw gesture data, Human-defined logics, Machine learning, situational context, and a persistence module;

a parameter generation module, configured to generate a plurality of parameters with key-value pairs with respect to the detected intent of the end-user, wherein the key is a unit of identification of a particular set of information, and the information being the value of the corresponding key;

a persistence module, configured to
identify one or more actions associated with the detected at least one of the product to be purchased, the service to be purchased and the intent of the end-user,
persist conversations by storing a plurality of parameters with key-value pairs generated from second one or more inputs from the end-user on the identified one or more actions to at least one of a database and a memory,
wherein the generated plurality of parameters with key-value pairs being updated with at least one of addition of new one or more parameters, overwriting existing key-value pairs, and appending new information to the key based on the second one or more inputs, thereby maintaining a continuous dialogue connecting each of the inputs from the end-user;

an action module, configured to communicate information associated with the confirmed identified one or more actions to at least one of one or more natively offered services, one or more third parties matching with the generated plurality of parameters with key-value pairs and one or more connected IoT devices,
wherein the one or more third parties are connected to the cohesive internet-hosted platform for providing at least one of the product and the service to the end-user,
wherein the one or more connected IoT devices are configured to perform an action in compliance with the communicated relevant action;

a response processing module, configured to
receive a response in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions,
wherein the interested one or more third parties are ranked by a quality determination module by appraising the response on the basis of at least one of product or service relevance, price for the end-user, brand quality, previous fulfilment ratio, previous user feedback and a bid rate for an intermediary; and
enable the user to select a third party from the ranked one or more third parties.

Dated this the 14th day of September 2018

Signature

Vidya Bhaskar Singh Nandiyal
Patent Agent (IN/PA-2912)
Agent for the Applicant

Description:

FIELD OF INVENTION
[0001] Embodiments of the present invention relate to an intelligent system for assisting users in performing a desired task, and more particularly to method and system for initiating and fulfilment of desired action over cohesive internet-hosted platforms.
BACKGROUND
[0002] Traditionally, there have been two modes for carrying out any purchase or any particular task-based action: Physical – over the counter – such as a grocery vendor, medical shop or a mall where a person has to go to do a task; and Telephonic/Email/Mail et al. – calling or communicating with another physical person to achieve a particular task. In the past decade, with the increase in the reach of the internet to the masses and the advancement of internet-based systems, one more mode, the Web/Computer/Mobile App, has been added. More and more people are switching to online Web/Computer/Mobile Apps for availing services as well as products, using keyboards, mouse navigation, touch and other user-controlled devices to complete a “purchase” or fulfil a desired action.
[0003] Furthermore, with the emergence of Artificial Intelligence and Machine Learning technologies, service providers are trying to augment the existing online systems and provide easier and simpler ways of carrying out any desired action, including purchase, sale, availing services and so on.
[0004] Even with the ongoing research, there is as yet no way to make a purchase, achieve a task, and generally participate in commerce with human voice or gesture or a combination thereof. The existing state of the art, when it comes to commerce conducted over voice-based, gesture-based or other human-input-based platforms, is struggling, as it is limited either by the availability of options given to the end-user, or by the ease of use and flexibility in selecting the desired option in a way that is intuitive to the user.
[0005] The present state of the art has also been unable to achieve the “gold standard” of the seamlessness of human-to-human voice communication in a computer-human voice communication model. Present human-computer vocal communication requires the user to “invocate” a particular product, service or brand in order to fulfil a task, and is not cohesive in nature, thus creating a barrier to use. The presently used systems are not natural and are unable to replicate the human-to-human understanding that occurs in a real physical conversation, because they do not persist conversations accurately enough. In addition, the present state of the art is limited to chat-type systems which follow the CICO “chat-in chat-out” model (chat-ins are commands received from the end-user, chat-outs are their subsequent responses). Such chat-based systems grossly limit the options that can be presented to the user at one point of time (as one chat can only contain one response, and multiple chat-outs can be overwhelming for the user to read and act on together), while also limiting the flexibility and usability of choosing the right option for the end-user (since only one chat-in can be received and processed by the current state of the art at one time).
[0006] Human voice or gesture-controlled systems and methods lack the capability of understanding intent from a regular, human-to-human-like conversation or talk. Instead, users need to learn and follow specified formats or guidelines for operating such systems or making use of such methods. Therefore, a user cannot casually give a voice command in a manner akin to conversation with another person.
[0007] Hence, there is a need for an improved method and system for initiating and fulfilment of desired action with the help of voice or gesture command to address the aforementioned issues.

SUMMARY
[0008] In accordance with an embodiment of the present disclosure, a method for initiating and fulfilment of desired action over cohesive internet-hosted platforms is provided.
[0009] The method includes receiving a first input from an end-user comprising information associated with at least one of a product to be purchased, a service to be purchased and an intent of the end-user. The input is received in the form of at least one of an unstructured voice command, a text message, a gesture, and data associated with the user captured by at least one of one or more sensors and one or more IoT devices.
[00010] The method also includes analysing the first input to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user. The detection is performed by one or more application specific modules based on at least one of a predictive analysis, a database with reference points, raw gesture data, Human-defined logics, Machine learning, situational context, and a persistence module.
[00011] The method also includes generating a plurality of parameters with key-value pairs with respect to the detected intent of the end-user, wherein the key is a unit of identification of a particular set of information, and the information is the value of the corresponding key.
[00012] The method also includes identifying one or more actions associated with the detected at least one of the product to be purchased, the service to be purchased and the intent of the end-user.
[00013] The method also includes persisting conversations by storing a plurality of parameters with key-value pairs, generated from second one or more inputs from the end-user on the identified one or more actions, to at least one of a database and a memory. The generated plurality of parameters with key-value pairs is updated with at least one of addition of new one or more parameters, overwriting of existing key-value pairs, and appending of new information to the key based on the second one or more inputs, thereby maintaining a continuous dialogue connecting each of the inputs from the end-user.
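The persistence described above can be sketched as a simple merge over the stored key-value parameters. The following is an illustrative sketch only, not the disclosed implementation; the function name, key names and the set of append-style keys are assumptions.

```python
def persist_turn(conversation: dict, new_params: dict,
                 append_keys=frozenset({"toppings", "notes"})):
    """Merge parameters from a new end-user input into the persisted conversation.

    A key not seen before is added, an existing key is overwritten by default,
    and keys listed in append_keys accumulate values instead, so information is
    appended rather than lost across turns of the dialogue.
    """
    for key, value in new_params.items():
        if key in append_keys:
            conversation.setdefault(key, []).append(value)  # append new information to the key
        else:
            conversation[key] = value  # add a new parameter or overwrite the existing pair
    return conversation

# First input: "order a large pizza"
state = persist_turn({}, {"intent": "food_ordering", "item": "pizza", "size": "large"})
# Second input: "make it medium, with olives"
state = persist_turn(state, {"size": "medium", "toppings": "olives"})
# state now reflects the whole dialogue: size overwritten, toppings appended
```

Because every turn merges into the same state, later inputs can be interpreted against everything said earlier, which is the "continuous dialogue" behaviour the method describes.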
[00014] The method also includes communicating information associated with the confirmed identified one or more actions to at least one of one or more natively offered services, one or more third parties matching with the generated plurality of parameters with key-value pairs, and one or more connected IoT devices. The one or more third parties are connected to the cohesive internet-hosted platform for providing the product and the service to the end-user. The one or more connected IoT devices are configured to perform an action in compliance with the communicated relevant action.
[00015] The method also includes receiving a response in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions. The interested one or more third parties are ranked by a quality determination module by appraising the response on the basis of at least one of product or service relevance, price for the end-user, brand quality, previous fulfilment ratio, previous user feedback and a bid rate for an intermediary.
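One plausible way to realise the ranking described above is a weighted score over the appraisal criteria. The criteria themselves come from the disclosure, but the weights, field names and sample figures below are illustrative assumptions, not the patented method.

```python
def rank_responses(responses, weights=None):
    """Rank third-party responses by a weighted score over the appraisal criteria."""
    weights = weights or {
        "relevance": 0.30,         # product or service relevance for the end-user
        "price_score": 0.20,       # cheaper for the end-user scores higher
        "brand_quality": 0.15,
        "fulfilment_ratio": 0.15,  # previous fulfilment ratio
        "user_feedback": 0.10,     # previous user feedback
        "bid_rate": 0.10,          # bid rate for the intermediary
    }
    score = lambda r: sum(weights[k] * r.get(k, 0.0) for k in weights)
    return sorted(responses, key=score, reverse=True)

responses = [
    {"party": "VendorA", "relevance": 0.9, "price_score": 0.6, "brand_quality": 0.8,
     "fulfilment_ratio": 0.95, "user_feedback": 0.7, "bid_rate": 0.2},
    {"party": "VendorB", "relevance": 0.7, "price_score": 0.9, "brand_quality": 0.6,
     "fulfilment_ratio": 0.80, "user_feedback": 0.9, "bid_rate": 0.3},
]
ranked = rank_responses(responses)  # VendorA ranks first on this weighting
```

The weights could equally be learned or tuned per user; the point of the sketch is only that the enumerated criteria can be combined into a single ordering presented to the end-user.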
[00016] The method further includes enabling the user to select one third party from the ranked one or more third parties.
[00017] In a further embodiment of the present disclosure, the method further comprises suggesting one of one or more alternatives and corrections to the user based on the relevance of the response and the price for the end-user from the one or more third parties interested in fulfilling the predicted one or more relevant actions.
[00018] In another further embodiment of the present disclosure, the method further comprises arranging for home delivery of the placed order, wherein arranging comprises one of allocating a human resource for pickup and delivery of the placed order and instructing a connected third party for the home delivery.
[00019] In yet another further embodiment of the present disclosure, the method further comprises predicting future needs of the user by the persistence module, wherein the prediction is performed by the following: analysing the frequency and type of the inputs of the user and corresponding intents to identify a pattern; predicting one or more future requirements of the user based on the identified pattern; prompting the user to confirm before initiating an action relevant to fulfil the predicted one or more future requirements of the user; and initiating the action, upon receiving confirmation from the user, to fulfil the predicted one or more future requirements, wherein initiating the action comprises communicating the predicted future requirement to at least one of the matching one or more third parties and one or more connected IoT devices.
[00020] In a further embodiment of the present disclosure, the method further comprises scheduling one or more actions based on one of the predicted one or more future requirements of the user and express input for scheduling a task by the user.
[00021] In another further embodiment of the present disclosure, the method further comprises making a payment to the selected third party for initiating the identified one or more relevant actions.
[00022] In another embodiment of the present disclosure, a system for initiating and fulfilment of desired action over cohesive internet-hosted platforms is provided.
[00023] The system comprises a receiving module, configured to receive a first input from an end-user comprising information associated with at least one of a product to be purchased, a service to be purchased and an intent of the end-user. The input is received in the form of at least one of an unstructured voice command, a text message, a gesture, and data associated with the user captured by at least one of one or more sensors and one or more IoT devices.
[00024] The system also comprises an intent detection module, configured to analyse the first input to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user. The identification is performed by one or more application specific modules based on at least one of a predictive analysis, a database with reference points, raw gesture data, Human-defined logics, Machine learning, situational context, and a persistence module.
[00025] The system also comprises a parameter generation module, configured to generate a plurality of parameters with key-value pairs with respect to the detected intent of the end-user, wherein the key is a unit of identification of a particular set of information, and the information is the value of the corresponding key.
[00026] The system also comprises a persistence module, configured to identify one or more actions associated with the detected at least one of the product to be purchased, the service to be purchased and the intent of the end-user, and to persist conversations by storing a plurality of parameters with key-value pairs generated from second one or more inputs from the end-user on the identified one or more actions to at least one of a database and a memory. The generated plurality of parameters with key-value pairs is updated with at least one of addition of new one or more parameters, overwriting of existing key-value pairs, and appending of new information to the key based on the second one or more inputs, thereby maintaining a continuous dialogue connecting each of the inputs from the end-user.
[00027] The system also comprises an action module, configured to communicate information associated with the confirmed identified one or more actions to at least one of one or more natively offered services, one or more third parties matching with the generated plurality of parameters with key-value pairs, and one or more connected IoT devices. The one or more third parties are connected to the cohesive internet-hosted platform for providing the product and the service to the end-user. The one or more connected IoT devices are configured to perform an action in compliance with the communicated relevant action.
[00028] The system also comprises a response processing module, configured to receive a response in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions. The interested one or more third parties are ranked by a quality determination module by appraising the response on the basis of at least one of product or service relevance, price for the end-user, brand quality, previous fulfilment ratio, previous user feedback and a bid rate for an intermediary. The response processing module is further configured to enable the user to select a third party from the ranked one or more third parties.
[00029] To further clarify the advantages and features of the present disclosure, a more particular description of the disclosure will follow by reference to specific embodiments thereof, which are illustrated in the appended figures. It is to be appreciated that these figures depict only typical embodiments of the disclosure and are therefore not to be considered limiting in scope. The disclosure will be described and explained with additional specificity and detail with the appended figures.
BRIEF DESCRIPTION OF THE DRAWINGS
The disclosure will be described and explained with additional specificity and detail with the accompanying figures in which:
[00030] FIG. 1 is a flow diagram representing method for initiating and fulfilment of desired action over cohesive internet-hosted platforms in accordance with an embodiment of the present disclosure;
[00031] FIG. 2 is a flow chart representing execution of an action through a cohesive internet-hosted platform of FIG. 1 in accordance with an embodiment of the present disclosure;
[00032] FIG. 3 is flow chart representing conversation to execution of the action in accordance with an embodiment of the present disclosure;
[00033] FIG. 4 is flow chart representing processing of input received in the form of gesture and identification of the intent of the end-user in accordance with an embodiment of the present disclosure;
[00034] FIG. 5 depicts the system for initiating and fulfilment of desired action over cohesive internet-hosted platforms in accordance with an embodiment of the present disclosure; and
[00035] FIG. 6 is a block level diagram of cohesive internet-hosted platform in accordance with an embodiment of the present disclosure.
[00036] Further, those skilled in the art will appreciate that elements in the figures are illustrated for simplicity and may not have necessarily been drawn to scale. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the figures by conventional symbols, and the figures may show only those specific details that are pertinent to understanding the embodiments of the present disclosure so as not to obscure the figures with details that will be readily apparent to those skilled in the art having the benefit of the description herein.
DETAILED DESCRIPTION
[00037] For the purpose of promoting an understanding of the principles of the disclosure, reference will now be made to the embodiment illustrated in the figures and specific language will be used to describe them. It will nevertheless be understood that no limitation of the scope of the disclosure is thereby intended. Such alterations and further modifications in the illustrated system, and such further applications of the principles of the disclosure as would normally occur to those skilled in the art are to be construed as being within the scope of the present disclosure.
[00038] The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a process or method that comprises a list of steps does not include only those steps but may include other steps not expressly listed or inherent to such a process or method. Similarly, one or more devices or sub-systems or elements or structures or components preceded by "comprises... a" does not, without more constraints, preclude the existence of other devices, sub-systems, elements, structures, components, additional devices, additional sub-systems, additional elements, additional structures or additional components. Appearances of the phrase "in an embodiment", "in another embodiment" and similar language throughout this specification may, but not necessarily do, all refer to the same embodiment.
[00039] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the art to which this disclosure belongs. The system, methods, and examples provided herein are only illustrative and not intended to be limiting.
[00040] In the following specification and the claims, reference will be made to a number of terms, which shall be defined to have the following meanings. The singular forms “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise.
[00041] Embodiments of the present disclosure relate to a method and a system for initiating and fulfilment of desired action over cohesive internet-hosted platforms.
[00042] In the context of the present invention an ‘Invocated platform’ would mean a platform where in order to transact or do ‘any desired activity’, a user needs to invoke a particular application. This essentially means, if the user wants to place an order for food, the user may have to say “Talk to Zomato®”. Then the invocated platform proceeds to find restaurants or cuisines the user wants based on the application-level logic defined by Zomato®.
[00043] Further, in the context of the present invention a ‘Cohesive platform’ would mean a platform where all applications come on one platform, and each invocation or “request for action” does not need to be individually triggered or invocated for the action to actually take place.
[00044] The present method and system enable end-users to fulfil one or more desired actions based on the first and second one or more inputs, to detect at least one of the product to be purchased, the service to be purchased and the intent of the end-user. The present method and system are configured to understand the expressed and unexpressed intent of the user in the first and second one or more inputs. The present method and system provide for an input/command-controlled virtual entity or robot that understands what the end-user wishes or intends, understands and processes the intent, makes sense of what the end-user meant by the input/command, links it to pre-set or automatically learned actions, and executes those particular actions efficiently while communicating and keeping the end-user informed of what is happening. There may be a plurality of desired actions that may be initiated and fulfilled by the end-users including, but not limited to, ordering food and beverages; recharging a phone or any pre-paid instrument; reminding about and paying any sort of utility or water bill; shopping online; calling a cab or auto; booking a bus, flight or train; booking movies; getting a desired task done such as laundry or pick-up and drop; and calling for daily home-maintenance chores such as an electrician or plumber. It is to be appreciated that the existing systems’ capability is limited to a level at which they can remind a user of the need to buy a gift for an occasion such as a birthday. The present invention, unlike the existing systems, actually orders a gift for the user and/or delivers the gift on the birthday to the concerned person.
[00045] In the context of the present invention, an intent is the basic understanding of what the user wants to achieve, whereas an action is the specific “to do” that has to be executed in order for that intent to be fulfilled. For example, “food ordering” may be the intent; the action would be “order pasta from La Pizzeria”. Intent detection, on the other hand, is only responsible for understanding what the user wants to achieve, and for providing as much information as possible on that detected intent, which an action determination module may then utilize to perform & predict tasks.
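The intent/action distinction above can be illustrated with a minimal sketch; the function names, trigger words and entity fields here are assumptions for illustration, not part of the disclosed system:

```python
# Hypothetical sketch: an intent is the broad goal; an action is the
# concrete "to do" derived from the intent plus its entities.

def detect_intent(utterance: str) -> str:
    """Map a raw utterance to a broad intent label (assumed triggers)."""
    if any(w in utterance.lower() for w in ("hungry", "food", "pasta", "pizza")):
        return "food_ordering"
    return "unknown"

def determine_action(intent: str, entities: dict) -> str:
    """Turn a detected intent plus entities into a concrete action string."""
    if intent == "food_ordering":
        return f"order {entities['item']} from {entities['restaurant']}"
    return "no_action"

intent = detect_intent("I am hungry, get me pasta")
action = determine_action(intent, {"item": "pasta", "restaurant": "La Pizzeria"})
```

Here "food ordering" is the intent and "order pasta from La Pizzeria" is the action, matching the example in the text.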
[00046] The present method provides for receiving one or more inputs as commands from a user in a plurality of forms including, but not limited to, voice, text, gesture, touch/haptic, data associated with biophysical conditions of the end-user including, but not limited to, heart rate, blood pressure, perspiration, palpitation, EEG, ECG, and data associated with surroundings of the end-user. The inputs may be categorised into active inputs and passive inputs. The active inputs comprise the inputs provided directly by the end-user, such as voice, text, gesture, touch/haptic. The active inputs are provided by the end-user via a plurality of sensors or communicatively connected devices or IoT devices. The passive inputs comprise the inputs automatically acquired via the one or more sensors without active participation from the end-user, such as the data associated with biophysical conditions of the end-user and the data associated with surroundings of the end-user.
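The active/passive categorisation above can be sketched as a simple lookup; the form labels used here are assumed names, not identifiers from the disclosure:

```python
# Illustrative classification of input forms into the two categories
# described above: active (provided directly) vs passive (sensed).

ACTIVE_FORMS = {"voice", "text", "gesture", "touch"}
PASSIVE_FORMS = {"heart_rate", "blood_pressure", "eeg", "ecg", "ambient_temperature"}

def classify_input(form: str) -> str:
    """Return 'active', 'passive' or 'unknown' for a given input form."""
    if form in ACTIVE_FORMS:
        return "active"
    if form in PASSIVE_FORMS:
        return "passive"
    return "unknown"
```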
[00047] The present method processes the input to detect a product to be purchased and/or a service to be purchased/availed and/or the intent of the end-user. Based on these simple and easy forms of input, the present method initiates one or more relevant actions necessary to fulfil the detected at least one of the product to be purchased, the service to be purchased/availed and the intent of the end-user. The relevant actions are executed by an intermediary hosting a cohesive internet-hosted platform with the help of one or more native services or one or more third parties. The whole method follows an easy conversational mode and enables the end-user to make a purchase, achieve a task, and participate in commerce with the human voice or the gesture or the data representing biophysical conditions of the end-user or the data representing surroundings of the end-user.
[00048] FIG. 1 is a flow diagram representing method for initiating and fulfilment of desired action over cohesive internet-hosted platforms in accordance with an embodiment of the present disclosure.
[00049] A first input is received from an end-user at step 102. The first input is received by a receiving module configured in the cohesive internet-hosted platform. The first input comprises information associated with at least one of a product to be purchased, a service to be purchased and an intent of the end-user. The intent of the end-user may be understood as any desire or requirement of the end-user such as hunger, recharging a phone, need for entertainment, shopping, booking train or flight or bus tickets etc. In an embodiment of the present disclosure, the input received is in the form of at least one of unstructured voice command, text message, gesture, and data associated with the end-user captured by at least one of one or more sensors and one or more IoT devices. Therefore, the input received may be in different forms depending on the form of input chosen by the end-user, and each form of the input is processed and standardised in a specific way. In an alternative embodiment, the input is received by a receiving module configured in an electronic device associated with the end-user. The electronic device is configured to be communicatively connected with the IoT devices, one or more sensors configured to sense the data associated with the end-user, and the cohesive internet-hosted platform. The electronic device may include a hand-held device, smart watch, mobile phone, laptop and desktop. The communicative connection is achieved via, but not limited to, WiFi, Bluetooth, Bluetooth Low Energy (Bluetooth LE or BLE), LPWAN (low-power wide-area network), LoRa (Long Range technology), Zigbee, WLAN, and mobile network or mobile data etc.
[00050] The first input being received from the end-user via communicatively connected at least one of a display unit, a camera, a haptic unit, one or more IoT devices and a microphone unit. In an embodiment, the end-user is enabled to provide the one or more inputs on the display unit of a communicatively connected electronic device associated with the end-user, where the input may be in form of typed text or selection of options. In another embodiment, the communicatively connected camera is configured to take pictures of surroundings of the end-user, desired objects, and the end-user. The surroundings include place or location where the end-user is present in real time. The desired objects may include real objects present around the end-user or the virtual objects displayed on any display screen, paper, or wall. The gestures represent a specific action desired by the end-user. In an alternative embodiment, the one or more cameras are configured to continuously record and live-stream video of the user to the cohesive internet-hosted platform.
[00051] In yet another embodiment, the communicatively connected haptic unit is configured to receive one or more inputs in the form of touch from the end-user.
[00052] In yet another embodiment, the communicatively coupled microphone is configured to capture the voice of the end-user as the one or more inputs. In a scenario, the end-user clicks a picture of any desired object and the captured image becomes the input from the end-user. Following that, the end-user may also provide another voice or text input. For example, the end-user takes a picture of a shoe and then says “I want to buy this shoe”.
[00053] The first input is analysed to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user at step 104, where the analysis being performed by an intent detection module configured in the cohesive internet-hosted platform. The identification being performed by one or more application specific modules based on at least one of a predictive analysis, database with reference points, a raw gesture data, Human-defined logics, Machine learning, situational context, and a persistence module configured in the cohesive internet-hosted platform.
[00054] Each of the form of the input may be processed in a specific way depending on the form of at least one of the unstructured voice command, the text message, the gesture, and the data associated with the user captured by at least one of one or more sensors and one or more IoT devices.
[00055] In an embodiment, the voice input may be an unstructured voice command, where the voice command received from the end-user is not in any specified format or following any guidelines and representing conversation between two humans. The unstructured voice command is converted, by a voice to text conversion unit, into text corresponding to at least one of a language used in the unstructured voice command. Therefore, a single voice command may include multiple languages, such as mixture of words belonging to Hindi and English languages. In another embodiment, the voice to text unit may be a self-learning unit that is tailored to the accentual enunciations of the end-users’ dialect and may be multi-lingual.
[00056] In another embodiment of the present disclosure, the input may be a direct text message from the end-user. The text may be received from the end-user via a plurality of sources or apps or programs. In such an embodiment of the present disclosure, the method includes processing the converted text and/or the text message by a natural language processing module before being analysed, by the intent detection module, to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user.
[00057] The step of processing the one of the converted text and the text message by the NLP unit, configured in the cohesive internet-hosted platform, comprises receiving the text and converting it into actionable information with the help of the intent detection module. For example, the first input of the end-user may be “get me Pasta from Vaishali”. The intent is detected by the intent detection module by using specific words & their synonyms to first understand what the end-user actually wants to achieve (such that an input of “I am hungry” can trigger the “food ordering” intent). The intent detection module is configured to understand that in this case the food entity is “pasta” and the restaurant entity is “Vaishali”. The intent detection is also made more accurate through application-specific modules tailored towards fulfilling each type of request from the end-user. Therefore, a relevant application specific module is triggered depending on the intent detected to provide relevant options to the end-user. The NLP unit swings into action when the input is either text or voice; other forms of the inputs bypass it and reach the intent detection module directly.
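The text-to-actionable-information step can be sketched with simple keyword and pattern matching; the trigger words and the "get me ... from ..." pattern are assumptions chosen to fit the Vaishali example, not the disclosed NLP technique:

```python
import re

# Minimal illustrative sketch of the NLP/intent-detection step: detect a
# broad intent from trigger words, then pull out food and restaurant
# entities with a pattern (both the word list and regex are assumptions).

FOOD_TRIGGERS = {"hungry", "pasta", "dosa", "pizza", "food"}

def detect(text: str) -> dict:
    """Return the detected intent and any extracted entities."""
    tokens = set(text.lower().split())
    intent = "food_ordering" if FOOD_TRIGGERS & tokens else "unknown"
    entities = {}
    m = re.search(r"get me (\w+) from (\w+)", text.lower())
    if m:
        entities = {"item": m.group(1), "restaurant": m.group(2)}
    return {"intent": intent, "entities": entities}
```

For the example input “get me Pasta from Vaishali”, this sketch detects the “food ordering” intent with the food entity “pasta” and the restaurant entity “vaishali”.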
[00058] In an embodiment, the input is a combination of two forms of input, such as a captured image and a voice command. Elaborating further upon the aforementioned example of the end-user taking the picture of the shoe and then saying “I want to buy this shoe”: in such an embodiment, the image is processed to identify the brand and model of the shoe, and the present method either enables the end-user to provide the shoe size or the shoe size is fetched from the database with reference points.
[00059] The cohesive internet-hosted platform also comprises a plurality of the application specific modules, where each of the application specific modules are mapped with one or more specific intent of the end-user or product or service to be purchased by the end-user. For example, there are distinct application specific modules for recharging phone or any pre-paid instruments; remind & pay any sort of utility, water bills; shopping online; calling a cab or auto; booking a bus, flight or train; booking movies; getting a task done like laundry, pick-up and drop; and calling for daily home-maintenance chores. Therefore, depending on the detected product to be purchased or the service to be purchased or the intent of the end-user a specific module is triggered thereby providing action specific user interface and options to the end-user.
[00060] In another embodiment, the input received is one or more gestures from the end-user. The gestures comprise any user-input detected through either a movement of any part of the body, or through a sensor that measures internal body data that is not visible to a camera or the naked eye. The gesture comprises fingers’ movement, hand movement, head movement, head nodding, leg movement, eye movement, eye blinking, and dilation of the pupils.
[00061] In the present embodiment, the one or more gestures are identified by analysis of the pictures of surroundings of the end-user, pictures of the desired objects, pictures of the end-user and the live-streamed video by a gesture identification module. Therefore, the one or more gestures are identified in the first input prior to the analysis of the first input, and the identified one or more gestures are analysed to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user. The one or more gestures are identified in the gesture input by one or more application specific modules based on at least one of a predictive analysis, the database with reference points, a raw gesture data, the Human-defined logics, the Machine learning, the situational context, and the persistence module, where each gesture is mapped with information associated with a specific intent and a corresponding action. For example, a gesture of two quick eye blinks may indicate a confirmation, a slight dilation of the pupils may indicate an information expansion request, and a long eye close may indicate dissent. Another example of a gesture may be nodding, in the affirmative or negative. Hand or limb movements may indicate another form of gesture. A movement of the hand pointing to a particular object, followed by a gesture, may indicate a request to find out its price. It is to be noted that in cases where the input is the gesture, the NLP unit is not activated.
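The gesture-to-meaning mapping described above can be sketched as a lookup table; the key and label strings are assumed names mirroring the examples in the text:

```python
# Illustrative mapping of identified gestures to their assigned meanings,
# following the examples above (two quick blinks -> confirmation, pupil
# dilation -> information expansion, long eye close / negative nod -> dissent).

GESTURE_MAP = {
    "two_quick_blinks": "confirm",
    "pupil_dilation": "expand_information",
    "long_eye_close": "dissent",
    "nod_affirmative": "confirm",
    "nod_negative": "dissent",
    "point_at_object": "price_enquiry",
}

def interpret_gesture(gesture: str) -> str:
    """Return the mapped meaning, or 'unrecognised' for unknown gestures."""
    return GESTURE_MAP.get(gesture, "unrecognised")
```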
[00062] In yet another embodiment, the input received is the data associated with the end-user captured by at least one of one or more sensors and one or more IoT devices. The data associated with the user may be categorised into an internal data and an external data. The internal data represents data captured by one or more sensors representing biophysical parameters of the end-user. The biophysical parameters comprise at least one of a heart rate of the end-user, a pulse rate of the end-user, a blood oxidation level, an electroencephalogram (EEG), an electrocardiogram (ECG) and body temperature. The external data represents data captured by one or more sensors representing parameters or situations surrounding the end-user such as, but not limited to, ambient temperature, humidity, traffic on streets, etc.
[00063] In an embodiment of the present disclosure, the input may be the internal data. In such cases, sensors might detect the pulse rate of a user, and might prompt a voice-output to the user that “you’re taking too much stress, would you like me to play some soothing music or just tell a joke to calm your nerves”.
[00064] In an embodiment of the present disclosure, the input may be the external data. In an exemplary scenario, when the end-user is driving a car, and a camera sensor using at least one of LDR, RADAR, image-sensing, and AI, predicts that the traffic in front of the end-user shall take around 15 more minutes to clear out, making the end-user late for a meeting that was scheduled at 12PM. The present method will then prompt the end-user – “Would you like me to message X that you’re running late by 15 minutes because of the traffic?”.
[00065] The working of the Application Specific Modules (ASM) within Intent Detection may be understood as follows:
(a) Generic Module Logic (applicable to all the modules) being configured to store, process & analyse each of the end-user’s inputs (the first input or the second one or more inputs) and simultaneously serve as a determining entity of intent detection. Therefore, the first function of a module is to store, process & analyse whatever input has come in. This is common to each module employed by the present invention. The storage may take place in on-the-go memory or in a database. The processing & analysis take place based on the nature and configuration of each module, which is disclosed in a later part of the disclosure. The second function of a module is to serve as a determining entity for the overall intent detection. After its functions are executed successfully, an ASM may output the following 3 parameters grouped on a key of the ASM: the detected intent; more information about that intent, which will form a part of the generated key-value pairs; and weightage, a ranking parameter which is optionally added to the output of any module. The weightage is then considered by the master output of the intent detection as well as other linked modules. The weightage determines the preference to be given to the output of a particular module over the output of any other module, in cases where there is a conflict in terms of intent detection of one input among the ASMs. Finally, all information from each of the ASMs is grouped & ranked together and converted into the following key-value pairs: OFI (One Fixed Intent), where the intent with the highest weightage is the OFI; this OFI is then used in all interactions & processes. All ASM data, i.e. all data from the existing ASMs, will also be a part of the key-value pairs; it may be optionally referenced based on the need by further modules.
(b) Situational Context Analysis (SCA) being configured to come up with an accurate context that describes the situation in which the user is currently making the input. By understanding the situation, accurate determination of intent is possible. Every input received by the cohesive platform contributes towards the making up of the situational context. If the input is made through voice, the detected tonality, volume, sentiment, placement of words and emotion are utilized to analyse and reach upon a context. The same is true when the input is made through text, except that in this case the volume is not used. If an input is available through an internal or external data sensor or a gesture, the data is then classified and matched based on its parameters and importance towards any current or subsequent queries. The situational context may refer to the other ASMs within intent detection to further refine its own context, and also provide contextual information necessary for them to accurately detect the intent. The situational context will be a dynamic string of words or numbers that can define the current situation more accurately, stored in a database with reference points (DBR), referenced in Human-defined logics (HL) or determined more automatically by Machine learning (ML).
(c) Predictive Analysis (PA) module being configured to predict intents specific to the context of a situation, and attempt to match them with the user input, to achieve a more accurate detection of intent from the user input. The PA uses a method to assign specific weightage to intents occurring over a specific context that occurs repeatedly over a period of time. For example, the time is 5:30 pm, and through machine learning, human-defined logics & database referencing, the system knows that it is time for the end-user to leave office. Generally, a cab is called by the end-user at this point of time. Therefore, considering the context of “User is leaving office at 5:30 pm, and generally orders a cab at this point of time”, the intent with the highest weightage from this ASM would be “cab booking”. Another aspect of predictive analysis is referencing what was done by other end-users at this point of time. As an example, if User A has performed X & Y actions in the past, and through data aggregation via the ML, the HL & the DBR the present method and system know that other users who have performed X & Y tasks also perform task Z when there is a specific situational context, it would predict the above as the intent. Note that this is different from the “users who purchased X also purchased Y” system, simply because it also takes into consideration the specific situational context to drive more relevant analysis.
(d) Human-defined logics (HL) being configured to process the input and other ASM information through a specific & pre-decided technique defined by a human well in advance, so as to generate a higher level of accuracy and customization in the final output of the intent. The HL works under 2 conditions – (i) to process raw input data by NLP techniques such as any one of or a combination of POS tagging, segmentation, sentence breaking, stemming, extraction, semantic analysis, regular expression & summarization, and output the most relevant intent; and (ii) to further refine intents & other information detected by other ASMs through customized techniques and flows for each situational context. Once it does the above 2 tasks, it outputs information with weightage assigned to its output. Again, this module shall utilize other ASMs like the DBR and the ML.
(e) Machine learning (ML) being configured to automate all tasks of the HL and help in data aggregation. The ML uses each of the detected intents and associated information, combines that information with the actions executed, and works out a model which teaches itself logic on how to further refine & provide better results. The ML is also responsible for aggregating predictive analysis data.
(f) Database Referencing (DBR) being configured to provide better accuracy in intent detection through the reference of pre-stored information in the form of sessions, cookies, memory, databases, third party information etc. Every single input is stored in the database, along with all detected information like intents, intent-related information, ASM information, actions, persistence-related information, replies etc. For every input, the database is also referenced for previous such inputs so as to aid the PA, the HL, the ML and the SCA.
(g) Persistence Unit (PU) being configured to interact with all other modules at every single point of time during intent detection, action determination, execution & replying. Therefore, during the intent & action detection, the PU draws on the persisted parameters on every input to add weightage, importance and context to the detection. During the action execution, the PU persists & pushes updates on actions that have been executed. Further, during the replying, the PU understands what was last replied by the system, to give a more contextual reply through the HL, the ML and the DBR.
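The grouping-and-ranking step in (a) above, where each ASM emits an intent with a weightage and the One Fixed Intent (OFI) is the highest-weighted one, can be sketched as follows; the ASM names and weightage values are illustrative assumptions:

```python
# Minimal sketch of ASM output aggregation as described in (a): each ASM
# contributes {intent, info, weightage}; the OFI is the intent with the
# highest weightage, and all ASM data is kept alongside it.

def select_ofi(asm_outputs: dict) -> dict:
    """asm_outputs maps ASM name -> {'intent': ..., 'info': ..., 'weightage': ...}."""
    best = max(asm_outputs.values(), key=lambda o: o.get("weightage", 0))
    return {"OFI": best["intent"], "asm_data": asm_outputs}

result = select_ofi({
    "predictive_analysis": {"intent": "cab_booking", "info": {"time": "17:30"}, "weightage": 0.9},
    "human_defined_logics": {"intent": "food_ordering", "info": {}, "weightage": 0.4},
})
```

With the predictive-analysis output weighted highest (as in the 5:30 pm cab example), the OFI resolves to "cab_booking" while the food-ordering candidate remains available in the grouped ASM data.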
[00066] A plurality of parameters with key-value pairs is generated with respect to the detected intent of the end-user at step 106. The generated plurality of parameters with key-value pairs is stored for future reference. The key is a unit of identification of a particular set of information, and the information is the value of the corresponding key. Therefore, at this step data-to-information conversion takes place by generating parameters with key-value pairs that are relevant to the intent. For example, in the “order pasta from Vaishali” example, the following parameters can be assigned from the text after the intent has been detected as “food ordering”: (“item” = “pasta”, “hotel” = “Vaishali”). Information generation can occur on tangible or intangible data, i.e. data that is passed on in the text-string as well as data which is passed on along with it or data which is already previously known (example: location coordinates, user's name, food preferences learned from other sources etc.).
[00067] The key-value pair may be further understood as the key being a unit of identification of a particular set of information, the information being the “value” of that key, which in turn can have a multitude of more keys which consist of more key-value pairs and so on. It is basically a multi-dimensional array of information and can virtually be any amount of data pertaining to the particular user request, decoded from the raw text received from the voice-to-text unit. For example, an input request including “order some pizza” may return the following key-value pairs:
User Intent = Order Food
Cuisine = Pizza
Location = Latitude, Longitude
(And following aspects that can be determined from the Persistence Unit)
User Likes :
Preference = Vegetarian
Toppings Preferred:
Preference1 = Olives
Preference2 = Jalapenos
Current Activity = Travelling in Train
Train Locations:
Last Station: X
Next Station: Y
[00068] Therefore, the present invention provides for a novel approach of storing this information as key-value pairs by dividing the information into 2 heads and storing the same: 1) information that needs to be overwritten and 2) information that needs to be appended over existing information. For example, in a case of a mobile recharge, the end-user wants to first recharge his phone ‘9999999999’. The current “remembered” parameter will be that it is a mobile number associated with the end-user. If he then wants to change the mobile number to ‘8888888888’, the parameter shall be overwritten. Further, in another case where the end-user wants to see recharge plans for 200 rupees first, and then wants to see plans between 200 and 300 rupees, ‘300’ would be appended to the existing query and not overwritten. Another example on these lines is that if the user first decides to change the operator/circle of a particular mobile number during a conversation, then changes to another number, and then comes back to his original number, the present invention may determine & render the operator/circle that was earlier selected by the user. Thus, as the present invention is more into application-specific commerce, it will be able to remember information in a more complex manner.
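The two-headed storage described above, overwrite vs append, can be sketched with a small store; which keys fall under which head is an assumption for illustration:

```python
# Minimal sketch of the persistence described above: some keys are
# overwritten on update (e.g. the remembered mobile number), others are
# appended to (e.g. a price range that grows from 200 to 200-300).

APPEND_KEYS = {"price_range"}   # assumed: keys whose values accumulate

def remember(store: dict, key: str, value):
    """Persist a parameter, appending or overwriting per its head."""
    if key in APPEND_KEYS:
        store.setdefault(key, []).append(value)   # append over existing info
    else:
        store[key] = value                        # overwrite

store = {}
remember(store, "mobile_number", "9999999999")
remember(store, "mobile_number", "8888888888")    # overwritten, as in the text
remember(store, "price_range", 200)
remember(store, "price_range", 300)               # appended, as in the text
```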
[00069] One or more actions associated with the detected at least one of the product to be purchased, the service to be purchased and the intent of the end-user are identified at step 108. The present invention determines the best course of action to take either through pre-existing action patterns preferred by other users, machine learning or pre-defined action sets. The identification is performed while using at least one of the pre-existing action patterns preferred by other users or pre-defined action sets, a database for referencing and expansion, a prediction module for predicting the next action, and machine learning. It is to be noted that the one or more actions may vary and differ in accordance with the product to be purchased, the service to be purchased and the intent of the end-user. For example, in a case where the service to be purchased is a periodic subscription of a phone recharge, the action would be contacting the relevant service provider for requesting the recharge automatically after the first subscription. Further, in a case where the service to be purchased is calling a cab or auto, the action would be contacting cab service providers operating in the region. In another example, where the end-user is hungry and wants to eat pasta, the action would be contacting restaurants having the capability to fulfil the requirement of pasta within a specified time and cost. In yet another example, where the end-user wants to have coffee, a connected IoT based coffee machine is instructed to make coffee. In yet another example, where the internal data of the end-user suggests that the end-user is tense, a connected IoT based music system is instructed to play soothing music. Therefore, based on the action to be taken, the one or more application specific modules are activated for processing. The application specific modules also provide a customised user interface and options to the end-user suitable for performing a given action.
[00070] Conversations are persisted by storing the plurality of parameters with key-value pairs generated from the second one or more inputs from the end-user on the identified one or more actions to at least one of a database and a memory at step 110. The conversations are persisted by the persistence module, so that the flow of the whole process takes place as if two individuals are conversing with each other. In an embodiment of the present invention, the database with reference points is updated with new reference points, and the generated plurality of parameters with key-value pairs is updated with new parameters and new key-value pairs.
[00071] At the step 110, the end-users are enabled to provide their second set of one or more inputs on the identified one or more actions. The second input may be a correction, alteration, or modification of the first input. The second input may also be a user-confirmation & precision. At this step, the method may or may not invoke/require a user confirmation through a message sent to the user through a Reply unit. Based on the second input of the end-user, filtering may be performed, where the end-user confirmation may be dynamic, and it is not necessary that the confirmation is “Yes” or “No”. In such an embodiment, the end-user may use his voice as the second input to further filter the parameters (which will be utilized by the Persistence unit), which will again trigger the NLP unit and reach this stage while again generating/updating the plurality of parameters with key-value pairs. Further, the end-user himself can serve as a contributor in action determination, thus making it possible for the present method to achieve precision of the right action. For example, the user may want “masala dosa”; the persistence unit enables him to make that specific selection from a result filtered from the first input of the word “dosa”.
[00072] During the processing of the second one or more inputs, the previously generated plurality of parameters with key-value pairs are updated with at least one of addition of new one or more parameters, overwriting existing key-value pairs, and appending new information to the key based on the second one or more inputs, thereby maintaining a continuous dialogue connecting each of the inputs from the end-user. The responsibility of the persistence unit is to “remember information” from the initial conversation, user-confirmation, reply, the next conversation and so on. This memory aspect helps the present method in maintaining continuity in the conversation and also connect each input from the end-user during the conversation to the overall relevance of the conversation.
[00073] It is to be noted that during the steps 102-110, referencing with the database with reference points is done to identify the relevant action to be taken with reference to the first and second inputs. In an embodiment of the present invention, an action unit references the database with reference points for generating the above-mentioned parameters, extracting more information about those parameters and their implications, and cross-referencing them across other databases and information provided by third parties in advance or in real time. In another embodiment of the present invention, the action-identification is performed based on the expanded amount of information available from the previous step. The present invention determines the best course of action to take either through pre-existing action patterns preferred by other users, machine learning or pre-defined action sets.
[00074] Information associated with the confirmed identified one or more actions is communicated to at least one of one or more natively offered services, one or more third parties matching with the generated plurality of parameters with key-value pairs, and one or more connected IoT devices at the step 112. The communication is performed by an action module. Therefore, the cohesive internet-hosted platform of the present invention includes any one of natively offered services, third parties and IoT devices, or a combination thereof, for processing the identified one or more actions.
[00075] The one or more third parties are connected to the cohesive internet-hosted platform for providing the product and the service to the end-user.
[00076] The one or more connected IoT devices are configured to perform an action in compliance with the communicated identified action. For example, if an end-user has given input that “I need coffee”, then an action detected in this regard would be “making coffee”. This identified action would be communicated to a communicatively coupled Coffee maker associated with the end-user. Based on the detected action, the coffee maker would be turned on to prepare one or more cups of coffee (depending on second input of the end-user, where the end user may be asked the number of cups of the coffee needed). Once the coffee is ready, the same would be communicated to the end-user.
[00077] A response is received in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions at step 114. The present invention provides for an arrangement where the hosts of the cohesive internet-hosted platforms are enabled to act as intermediary which facilitate the initiation and fulfilment of the desired action with the help of the connected third parties. The cohesive internet-hosted platforms are also configured to act as bidding platforms, where the third parties are enabled to send their bids for taking the order for the fulfilment of the desired action. In one embodiment, the bids can be pre-determined by the intermediary and service provider in advance or provided in real-time. In such embodiments, the bids received may be a ‘nil bid’ or a bid with value. In an embodiment of the present invention, the response received in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions comprises at least one of the bid rate for the intermediary, price for the end-user, information associated with one of the confirmed identified product and the service, time required in fulfilling the identified one or more relevant actions, and alternatives and suggestions if the product or service request in the relevant action is changed or discontinued.
[00078] The interested one or more third parties and the native services are ranked by a quality determination module by appraising the received response on the basis of at least one or more of product or service relevance, grading of the bid, price for the end-user, information associated with one of the product and the service, brand quality, previous fulfilment ratio, previous user feedback and a bid rate for an intermediary. Therefore, the end-user gets to see the most relevant third-party service providers, with the least involvement in the whole process.
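As an illustrative sketch only, the appraisal performed by the quality determination module could be expressed as a weighted score over the received responses; the field names, weights and scoring formula below are assumptions, as the specification does not prescribe a concrete formula:

```python
# Hypothetical sketch of the quality determination module's ranking step.
# Field names and weights are illustrative assumptions only.

def quality_score(response, weights=None):
    """Combine the appraisal criteria into a single comparable score."""
    w = weights or {
        "relevance": 0.30,         # product or service relevance
        "price": 0.20,             # price for the end-user (lower is better)
        "brand_quality": 0.15,
        "fulfilment_ratio": 0.15,  # previous fulfilment ratio
        "user_feedback": 0.10,     # previous user feedback
        "bid_rate": 0.10,          # bid rate offered to the intermediary
    }
    # Normalise price so that a cheaper offer scores higher (0..1 scale).
    price_score = 1.0 / (1.0 + response["price"])
    return (w["relevance"] * response["relevance"]
            + w["price"] * price_score
            + w["brand_quality"] * response["brand_quality"]
            + w["fulfilment_ratio"] * response["fulfilment_ratio"]
            + w["user_feedback"] * response["user_feedback"]
            + w["bid_rate"] * response["bid_rate"])

def rank_responses(responses):
    """Rank third parties and native services, best first."""
    return sorted(responses, key=quality_score, reverse=True)
```

In this sketch the weights sum to one, so the composite score stays comparable across responses regardless of how many criteria a given third party scores well on.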
[00079] The information associated with the ranked one or more third parties and the one or more native services is communicated to the end-user via at least one of a display unit, a haptic unit, one or more connected IoT devices and an audio unit. Sorting of information as per customer need is made possible by the present invention. Therefore, the present invention helps the end-users in making the selection of the most relevant service provider.
[00080] The present invention not only integrates with the third-party service providers, but also has its own native service offerings; for example, it may decide to deliver food through its own infrastructure rather than utilizing a third-party service provider for the same.
[00081] The end-user is enabled to select a third party or a native service from the ranked one or more third parties and the one or more native services for fulfilling the desired action at step 116.
[00082] In a further embodiment of the present invention, the method further comprises making a payment to the selected third party for initiating the identified one or more relevant actions. In one scenario, the payment may be made to the third party by the intermediary via the cohesive internet-hosted platform while deducting the bid rate quoted by the third party, and the end-user makes the payment to the intermediary via the cohesive internet-hosted platform as per the cost quoted by the intermediary. In another scenario, the payment can be made to the third party directly. The payment may be made online via one of a connected bank account, credit card, debit card, Interactive Voice Response (IVR) payment option, and digital wallet, with or without input of the end-user.
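The intermediary-routed payment split described above amounts to simple arithmetic, sketched below; the function and field names are hypothetical and not part of the specification:

```python
def settle_via_intermediary(end_user_price, bid_rate):
    """Split the end-user's payment: the intermediary retains the bid
    rate quoted by the third party and forwards the remainder to the
    third party fulfilling the action. Names are illustrative."""
    if bid_rate > end_user_price:
        raise ValueError("bid rate cannot exceed the quoted price")
    return {"intermediary": bid_rate,
            "third_party": end_user_price - bid_rate}
```

For instance, for a service quoted to the end-user at 250 rupees with a 25-rupee bid rate, the intermediary would retain 25 rupees and forward 225 rupees to the third party.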
[00083] In a further embodiment of the present invention, the method further comprises suggesting, by an internal predictive module, one of one or more alternatives and corrections to the user based on the relevance and the price for the end-user from the one or more third parties interested in fulfilling the predicted one or more relevant actions. The second input of the user is parsed and processed by the internal predictive module based on the situation-contextual response. These are situation-specific responses which make the conversational replies more personalized, informative and useful. For example, the end-user wants to recharge his phone for 200 rupees, but there are no plans available matching the said amount. In this scenario, the situational context may determine this and may mention the same to the user: “We couldn’t find any pack for 200 rupees, but look at these; we feel they would be more relevant to you”. Therefore, the reply to the end-user is not static but changes depending upon the situation and context of the user and the current conversational transaction.
[00084] In an embodiment of the present invention, the reply output is conveyed to the end-user by means of at least one of the following: text-to-voice, touch-sensitive display screen, web, e-mail, SMS, mobile notifications, hints, haptic feedback, lists, tables, and the like. For example, in the case of haptic sensations, where the present invention could not find what the end-user was looking for, a negative type of sensation with two presses can be transmitted on a wrist-band or digital/smart wrist watch that the user is wearing. In another example, the reply output may also be conveyed by triggering actions on other connected devices.
[00085] The cohesive internet-hosted platform disclosed by the present method for initiating and fulfilment of desired action enables the end-users to simultaneously check and compare the cost and delivery time of a same product or service being made available by various third-party service providers. The end-user is not required to repeat the whole process again and again for different third-party service providers. For example, an end-user says, “I’m looking for an ABC brand and 123 model of shoe”. Then the cohesive internet-hosted platform, which is already linked to the various third parties which are selling shoes, communicates the confirmed identified action to the linked third parties, and thereafter the third-party sellers send their responses with respect to the communicated identified action. Once the responses are received, the cohesive internet-hosted platform is enabled to group same items together. The cohesive internet-hosted platform is enabled to show recommended shoes to the end-user. Therefore, the present method enables the end-user to select a desired shoe while checking the price of the shoe across all the third-party sellers and make an informed purchase. The process is very easy and efficient for the end-user since he need not repeat the process for each of the third-party sellers. A similar experience is felt by the end-users when they use the present method to do something else, like ordering food, recharging a phone, etc.
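The grouping of same items across sellers could be sketched as below; the response fields (brand, model, seller, price, delivery days) are illustrative assumptions:

```python
from collections import defaultdict

def group_same_items(seller_responses):
    """Group third-party responses by product identifier so the end-user
    can compare price and delivery time across sellers in one view.
    Field names are hypothetical."""
    grouped = defaultdict(list)
    for resp in seller_responses:
        key = (resp["brand"], resp["model"])
        grouped[key].append({"seller": resp["seller"],
                             "price": resp["price"],
                             "delivery_days": resp["delivery_days"]})
    # Sort each group's offers by price so the cheapest appears first.
    for offers in grouped.values():
        offers.sort(key=lambda o: o["price"])
    return dict(grouped)
```

With such a grouping, one query such as “ABC brand, 123 model” yields a single comparison view instead of one enquiry per seller.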
[00086] In an embodiment of the present invention, the cohesive internet-hosted platform is a cohesive “bid-based” platform, where the third-party service providers and the native services offered may submit bids to the intermediary for providing/selling the relevant product or service to the end-user. Consider an example where the input command received from the user is “I want to book a cab” or “I’d like to book a cab for MG Road”, instead of the end-user providing an input command such as “Talk to Uber®”. The present invention then takes the current location of the end-user, calculates the distance, and sends a request to Uber®, Ola®, Taxi4Sure®, Meeru® and any X taxi-aggregation service or taxi-provider (the various third-party cab service providers available in that region). These third parties are connected to the present cohesive platform and are not specifically “invocated” by the end-user. When the cohesive “bid-based” platform sends the end-user request to the third parties, it includes a “minimum bid” rate as well, specifying the minimum bid value if the third parties would like to fulfil that end-user’s request. If the third parties wish to fulfil the end-user’s request, then the third parties have to meet that minimum bid value and reply with a bid offer that considers the intrinsic value of that particular user request and the ability of the third parties to fulfil it, wherein the bids can be pre-determined by the intermediary and the service provider in advance or provided in real time.
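The minimum-bid fan-out described above can be sketched as follows; the provider structure and the `bid_fn` callback are assumptions introduced purely for illustration:

```python
def broadcast_request(user_request, providers, minimum_bid):
    """Send the end-user request to every connected third party along
    with the minimum bid value, and keep only replies whose bid meets
    that minimum. A provider returning None declines the request.
    Structure and names are hypothetical."""
    accepted = []
    for provider in providers:
        bid = provider["bid_fn"](user_request)  # the provider decides its bid
        if bid is not None and bid >= minimum_bid:
            accepted.append({"provider": provider["name"], "bid": bid})
    return accepted
```

Note that no provider is individually “invocated”: the same request reaches every connected third party, and only the qualifying bids flow back for aggregation.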
[00087] In addition, each of those third parties sends a reply back to the cohesive “bid-based” platform as to the availability, type and pricing of their various options of cabs. The cohesive “bid-based” platform then aggregates that data from each third-party service provider and shows it to the end-user in real time in a contextual layout mentioning all retrieved options, taking into account the quality of their response, the quality of the third party, as well as the quality of the bid by each third party while prioritizing the responses.
[00088] Further, the cohesive “bid-based” platform enables the end-user to then consider availability/rates/brand and select a particular preferred option. The cohesive “bid-based” platform processes the payment internally and transfers the funds to the third party as per the respective contracts with that third party. In an embodiment of the present invention, the cohesive “bid-based” platform enables the end-user to transfer funds to the third party via the intermediaries. In an alternative embodiment of the present invention, the cohesive “bid-based” platform enables the end-user to pay the third party directly. However, for the end-user, the platform is completely cohesive. The end-user may track everything on the cohesive “bid-based” platform’s interface itself, which shall be consistent for any third party the end-user chooses. This is not the case with an invocated platform, whose output for each transaction varies greatly depending upon the approach of the third party being individually invocated. To understand the distinction between a cohesive platform and an invocated platform: in the framework of an invocated platform, there is no way of intra-sharing data within services to make it more contextual. For example, in Alexa®, if the user is talking to Zomato®, there is no way Swiggy® knows that the user is looking for “South Indian cuisine from restaurants around MG Road”. The only time Swiggy® will know is when the user ends the conversation with the currently invocated service by saying “end conversation” or “exit” or “cancel”, provides a new command and invocates the name of the new application: “talk to Swiggy®”. Further, the user re-mentions the requirements of the service: “show south Indian….”. This is where the difference starts becoming evident.
It is repetitive, and inconvenient, and there is no way to solve this problem, because the primary approach of existing platforms is to replicate the “web-based” and “mobile-based” models of commerce – which is particularly “app-centric”. The dynamics of conversation mode commerce are completely different and work better on a cohesive platform.
[00089] Moreover, because the cohesive platform knows what current action the end-user is taking, it can offer more contextual suggestions, predict the next step and inter-link that with other services offered on the platform, which is not possible in other existing platforms, because their persistence unit is application-specific, making the platform clueless when the conversation ends with one application. In the real world, this is comparable to short-term memory loss.
[00090] To further clarify, the present cohesive platform is different from an aggregator or an aggregator of aggregators. The cohesive platform is not restrained to aggregating one specific action or service. It is open to anything and everything that involves an action. None of the existing aggregators are universal, as they do not offer “everything under one roof” like the present cohesive platform. In summary, the novelty is an input (voice/gesture/text/internal and external data) controlled assistant powered by Artificial Intelligence, offering universally aggregated services, having the intra-sharing Persistence Unit, with the NLP Unit powered by application-specific algorithms/modules, validations and machine learning, over a non-invocated Action Unit on a cohesive platform.
[00091] In a further embodiment of the present invention, the method further comprises arranging for home delivery of the placed order, wherein arranging comprises one of allocating a human resource for pickup and delivery of the placed order and instructing a connected third party for the home delivery. The cohesive platform may use third party service providers or native services for arranging the home delivery.
[00092] In a further embodiment of the present invention, the method further comprises predicting future needs of the user by the persistence unit. The prediction is performed by the following steps: (a) analysing the frequency and type of the inputs of the user and corresponding intents to identify a pattern; (b) predicting one or more future requirements of the user based on the identified pattern; (c) prompting the user to confirm before initiating an action relevant to fulfil the predicted one or more future requirements of the user; and (d) initiating the action, upon receiving confirmation from the user, to fulfil the predicted one or more future requirements, wherein initiating the action comprises communicating the predicted future requirement to at least one of the matching one or more third parties and one or more connected IoT devices.
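Steps (a) through (d) can be sketched as below; the intent representation, the recurrence threshold and the callback names are assumptions, not part of the specification:

```python
from collections import Counter

def identify_pattern(input_history, min_occurrences=3):
    """Steps (a)-(b): analyse the frequency of past intents and predict
    future requirements from intents that recur often enough."""
    counts = Counter(entry["intent"] for entry in input_history)
    return [intent for intent, n in counts.items() if n >= min_occurrences]

def confirm_and_initiate(predicted_intent, confirm_fn, dispatch_fn):
    """Steps (c)-(d): prompt the user for confirmation and, if given,
    communicate the predicted requirement to matching third parties
    or connected IoT devices via the dispatch callback."""
    if confirm_fn(predicted_intent):
        return dispatch_fn(predicted_intent)
    return None
```

The confirmation gate in step (c) ensures the platform never initiates a predicted action unilaterally.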
[00093] In a further embodiment of the present invention, the method further comprises scheduling one or more actions based on one of the predicted one or more future requirements of the user and express input for scheduling a task by the user. For example, the end-user may schedule an Air conditioner to be turned on at a specified temperature, fan speed and for a specified duration.
[00094] FIG. 2 is a flow chart representing execution of an action through a cohesive internet-hosted platform of FIG. 1 in accordance with an embodiment of the present disclosure. The process starts after the one or more actions are identified which are associated with the detected at least one of the product to be purchased, the service to be purchased and the intent of the end-user. At step 202, the identified action is communicated to at least one of one or more natively offered services (206) and one or more third parties (208), asking for a minimum bid value (204). At step 210, an offer and bid response is received from the one or more natively offered services (206) and the one or more third parties (208). A quality determination unit (212) determines the quality of the responses received and identifies which of the one or more natively offered services (206) and one or more third parties (208) is the best to avail the service (228) or product (230). The quality is determined based on response sorting (214), third-party scoring metrics (216), and bid grading (218). The response sorting (214) is determined based on product or service relevance (220) and price for the end-user (222). The third-party scoring metrics (216) are based on brand quality (224), previous fulfilment ratio (226), and previous user feedback (226). At step 232, the identified one or more natively offered services (206) and one or more third parties (208) are aggregated based on the sorted responses. At step 234, the aggregated one or more natively offered services (206) and one or more third parties (208) are sent for execution to the action unit of the cohesive platform. Therefore, results to be displayed to the end-user in aggregated format are sent to the cohesive platform, which in turn displays them to the end-user. However, if the determined action requires a user confirmation, the execution shall take place only upon that confirmation.
Alternatively, if it is a precision or modification by the user, then the action shall be further filtered or refined before it gets executed at a step 238. In such scenarios, feedback of the precision or modification on the execution of the action is sent to the user via a reply unit (240). Therefore, in case of any inputs received from the end-user on the aggregated results, the same undergoes situational contextualisation via the persistence unit (242) and then the processed information is sent for the execution.
[00095] FIG. 3 is a flow chart representing conversation to execution of the action in accordance with an embodiment of the present disclosure. An input is received by an electronic device (304) from an end-user (302) via at least one of a plurality of sensors, a display unit, a camera, a haptic unit, IoT devices and a microphone unit. At step 306, based on the type of the input, one or more applications configured in the electronic device associated with the end-user are activated and analysis of the input is carried out. During the analysis, the input in the form of voice is converted into text by a voice-to-text unit (308), where the converted raw text is subsequently processed by an NLP unit (310). Further, the input in the form of text is directly sent to the NLP unit. At step 311, the converted raw text and/or the text is shared with an intent detection module (312). At step 313, where the input is received in the form of gesture or data sensed by the sensors, the same is shared with the intent detection module (312). The intent detection module (312) detects the intent of the user based on the analysis of the input. The analysis of the said input is conducted by Application Specific Modules or ASMs (336), based on at least one of a predictive analysis (338), human-defined logic (344), machine learning (340), situational context (346), the database with reference points (324), and the Persistence Unit (334). At step 314, a plurality of parameters with key-value pairs with respect to the detected intent of the end-user are generated by the intent detection module (312) based on the analysed input. The said parameters are then sent to the action module (316), where the action is determined (318) while acting in conjunction with the persistence module/unit (334), based on machine learning (340), the database with reference points (324) along with reference and expansion (326), pre-existing action patterns and pre-defined action sets (320), and prediction of the next action (322).
Subsequent to the determination of action, the action module (316) proceeds with execution (328) of the determined action. Before proceeding with the execution, the said information is communicated to the end-user to receive confirmation/precision/modification at 330 in conjunction with persistence module. This way the end-user is provided an opportunity to further refine his choices, if required, before execution. The persistence module (334) communicates the determined action, the output of its execution, and related information to the end-user via a reply module (240), wherein the communication may be in the form of text or voice or display or suggestions or predictions or haptic sensations or action in the IoT or combination thereof.
[00096] FIG. 4 is a flow chart representing processing of input received in the form of gesture or data captured by one or more sensors and identification of the intent of the end-user in accordance with an embodiment of the present disclosure. At step 402, sensors (410) receive input from the end-user in the form of gestures (408). In an alternative embodiment, the external (404) and the internal (406) data are captured by one or more sensors related to the end-user without any involvement of the end-user. Movement of the end-user is detected by analysing live pictures or video of the end-user. Based on the form of the input received, a corresponding application specific module (336) is activated and the received input is converted into information which is then communicated to the intent detection module (312) for further analysis. The intent detection module (312) detects at least one of the product to be purchased, the service to be purchased and the intent of the end-user. The intent detection module (312) detects the intent of the user based on the analysis of the input. The analysis of the said input is conducted by the Application Specific Modules or ASMs (336), based on at least one of a predictive analysis (338), human-defined logic (344), machine learning (340), situational context (346), the database with reference points (324), and the Persistence Unit (334). At step 314, a plurality of parameters with key-value pairs with respect to the detected intent of the end-user are generated by the intent detection module (312) based on the analysed input. The said parameters are then sent to the action module (316), where the action is determined (318) while acting in conjunction with the persistence module/unit (334), based on machine learning (340), the database with reference points (324) along with reference and expansion (326), pre-existing action patterns and pre-defined action sets (320), and prediction of the next action (322).
Subsequent to the determination of action, the action module (316) proceeds with execution (328) of the determined action. Before proceeding with the execution, the said information is communicated to the end-user to receive confirmation/precision/modification at 330 in conjunction with persistence module. This way the end-user is provided an opportunity to further refine his choices, if required, before execution. The persistence module (334) communicates the determined action, the output of its execution, and related information to the end-user via a reply module (240), wherein the communication may be in the form of text or voice or display or suggestions or predictions or haptic sensations or action in the IoT or combination thereof.
[00097] In another embodiment of the present disclosure, a system for initiating and fulfilment of desired action over cohesive internet-hosted platforms is provided. FIG. 5 depicts the system for initiating and fulfilment of desired action over cohesive internet-hosted platforms. The system comprises a plurality of devices (502) associated with the end-user (508), a cohesive internet-hosted platform configured in one or more servers, third-party service providers (504) and native service providers (506), all of which are communicatively connected to each other via a network. The plurality of devices (502) comprises the sensors, devices, camera, speakers, microphone and a combination thereof.
[00098] FIG. 6 is a block level diagram of the cohesive internet-hosted platform (236) in accordance with an embodiment of the present disclosure. The cohesive internet-hosted platform (236) includes processor(s) (602) and memory (606) coupled to the processor(s) (602).
[00099] The processor(s) (602), as used herein, means any type of computational circuit, such as, but not limited to, a microprocessor, a microcontroller, a complex instruction set computing microprocessor, a reduced instruction set computing microprocessor, a very long instruction word microprocessor, an explicitly parallel instruction computing microprocessor, a digital signal processor, or any other type of processing circuit, or a combination thereof.
[000100] The memory (606) includes a plurality of modules, stored in the form of an executable program which instructs the processor (602) to perform the method steps illustrated in FIG. 1. The memory (606) has the following modules: receiving module (612), intent detection module (312), parameter generation module (608), persistence module (334), action module (316), and response processing module (610).
[000101] Computer memory elements may include any suitable memory device(s) for storing data and executable program, such as read only memory, random access memory, erasable programmable read only memory, electrically erasable programmable read only memory, hard drive, removable media drive for handling memory cards and the like. Embodiments of the present subject matter may be implemented in conjunction with program modules, including functions, procedures, data structures, and application programs, for performing tasks, or defining abstract data types or low-level hardware contexts. Executable program stored on any of the above-mentioned storage media may be executable by the processor(s) (602).
[000102] The cohesive internet-hosted platform (236) also includes Pre-existing action patterns & pre-defined action sets (320), Database with reference points and Raw gesture data (324), reference and expansion (326), predictive analysis (338), machine learning (340), human-defined logic (344), and situational contexts (346). In an alternative embodiment, the Pre-existing action patterns & pre-defined action sets (320), Database with reference points and Raw gesture data (324), reference and expansion (326), predictive analysis (338), machine learning (340), human-defined logic (344), and situational contexts (346) are being configured in separate one or more back-end servers.
[000103] The cohesive internet-hosted platform configured in one or more servers comprises a receiving module, an intent detection module, parameter generation module, a persistence module, an action module, and a response processing module. The receiving module being configured to receive a first input from an end-user comprising information associated with at least one of a product to be purchased, a service to be purchased and an intent of the end-user. The input received is in form of at least one of unstructured voice command, text message, gesture, and data associated with the user captured by at least one of one or more sensors and one or more IoT devices.
[000104] The receiving module (612) instructs the processor(s) (602) to receive a first input from an end-user comprising information associated with at least one of a product to be purchased, a service to be purchased and an intent of the end-user. The input received is in form of at least one of unstructured voice command, text message, gesture, and data associated with the user captured by at least one of one or more sensors and one or more IoT devices.
[000105] The intent detection module (312) instructs the processor(s) (602) to analyse the first input to detect the at least one of the product to be purchased, the service to be purchased and the intent of the end-user. The identification being performed by one or more application specific modules based on at least one of a Pre-existing action patterns & pre-defined action sets (320), Database with reference points and Raw gesture data (324), reference and expansion (326), predictive analysis (338), machine learning (340), human-defined logic (344), and situational contexts (346), and persistence module (334).
[000106] The parameter generation module (608) instructs the processor(s) (602) to generate a plurality of parameters with key-value pairs with respect to the detected intent of the end-user, wherein the key is a unit of identification of a particular set of information, and the information being the value of the corresponding key.
[000107] The persistence module (334) instructs the processor(s) (602) to identify one or more actions associated with the detected at least one of the product to be purchased, the service to be purchased and the intent of the end-user. The persistence module is also configured to persist conversations by storing a plurality of parameters with key-value pairs, generated from second one or more inputs from the end-user on the identified one or more actions, to at least one of a database and a memory. The generated plurality of parameters with key-value pairs is updated with at least one of addition of one or more new parameters, overwriting existing key-value pairs, and appending new information to the key based on the second one or more inputs, thereby maintaining a continuous dialogue connecting each of the inputs from the end-user.
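The three update modes of the persistence module (adding new parameters, overwriting existing key-value pairs, and appending new information to a key) could be sketched as follows; the class and method names are illustrative assumptions:

```python
class PersistenceModule:
    """Sketch of the persistence module: stores the key-value parameters
    of a conversation and updates them on each subsequent input, so every
    input is interpreted against the accumulated context. Names are
    hypothetical, not taken from the specification."""

    def __init__(self):
        self.context = {}

    def update(self, new_params, append_keys=()):
        for key, value in new_params.items():
            if key in append_keys and key in self.context:
                # Append new information to an existing key.
                self.context[key] = "{}, {}".format(self.context[key], value)
            else:
                # Add a new parameter, or overwrite an existing key-value pair.
                self.context[key] = value
        return self.context
```

Because the context survives across turns, a follow-up input such as “make that Koramangala instead” only needs to overwrite one key rather than restart the dialogue.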
[000108] The action module (316) instructs the processor(s) (602) to communicate information associated with the confirmed identified one or more actions to at least one of one or more natively offered services, one or more third parties matching the generated plurality of parameters with key-value pairs, and one or more connected IoT devices. The one or more third parties are connected to the cohesive internet-hosted platforms for providing the product and the service to the end-user. The one or more connected IoT devices are configured to perform an action in compliance with the communicated relevant action.
[000109] The response processing module (610) instructs the processor(s) (602) to receive a response in real time from the one or more third parties interested in fulfilling the identified one or more relevant actions. The interested one or more third parties are ranked by quality determination module by appraising the response on the basis of at least one or more of product or service relevance, price for end-user, brand quality, previous fulfilment ratio, previous user feedback and a bid rate for an intermediary. The response processing module is also configured to enable the user to select a third party from the ranked one or more third parties.
[000110] The present method and system provide for a seamless human-to-computer and a computer-to-human communication while efficiently replicating human-to-human understanding occurring over a real physical conversation.
[000111] While specific language has been used to describe the disclosure, any limitations arising on account of the same are not intended. As would be apparent to a person skilled in the art, various working modifications may be made to the method in order to implement the inventive concept as taught herein.
[000112] The figures and the foregoing description give examples of embodiments. Those skilled in the art will appreciate that one or more of the described elements may well be combined into a single functional element. Alternatively, certain elements may be split into multiple functional elements. Elements from one embodiment may be added to another embodiment. For example, the order of processes described herein may be changed and is not limited to the manner described herein. Moreover, the actions of any flow diagram need not be implemented in the order shown; nor do all of the acts need to be necessarily performed. Also, those acts that are not dependent on other acts may be performed in parallel with the other acts. The scope of embodiments is by no means limited by these specific examples.

Documents

Application Documents

# Name Date
1 201821035035-STATEMENT OF UNDERTAKING (FORM 3) [17-09-2018(online)].pdf 2018-09-17
2 201821035035-PROOF OF RIGHT [17-09-2018(online)].pdf 2018-09-17
3 201821035035-POWER OF AUTHORITY [17-09-2018(online)].pdf 2018-09-17
4 201821035035-FORM FOR STARTUP [17-09-2018(online)].pdf 2018-09-17
5 201821035035-FORM FOR SMALL ENTITY(FORM-28) [17-09-2018(online)].pdf 2018-09-17
6 201821035035-FORM 1 [17-09-2018(online)].pdf 2018-09-17
7 201821035035-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [17-09-2018(online)].pdf 2018-09-17
8 201821035035-EVIDENCE FOR REGISTRATION UNDER SSI [17-09-2018(online)].pdf 2018-09-17
9 201821035035-DRAWINGS [17-09-2018(online)].pdf 2018-09-17
10 201821035035-DECLARATION OF INVENTORSHIP (FORM 5) [17-09-2018(online)].pdf 2018-09-17
11 201821035035-COMPLETE SPECIFICATION [17-09-2018(online)].pdf 2018-09-17
12 201821035035-FORM-9 [18-09-2018(online)].pdf 2018-09-18
13 201821035035-FORM-26 [18-09-2018(online)].pdf 2018-09-18
14 201821035035-FORM 18A [18-09-2018(online)].pdf 2018-09-18
15 ABSTRACT1.jpg 2018-09-19
16 201821035035-FER.pdf 2018-10-31
17 201821035035-ORIGINALUR 6(1A) FORM 1,3,5,26,28&DIPP CERTIFICATE-210918.pdf 2019-02-12
18 201821035035-AbandonedLetter.pdf 2019-07-23

Search Strategy

1 Searchstrategy_15-10-2018.pdf