
Generating Autonomous Vehicle (AV) Dataset By Processing Sourced Driving Scenarios

Abstract: A method and system for generating an Autonomous Vehicle (AV) dataset by processing sourced driving scenarios is disclosed. Driving scenarios received via multi-modal input are converted to an input text scenario. Domain aware LLMs, ontologies, and Knowledge Graphs preprocess the input text scenario to generate structured data for the driving scenario that contextualizes it and aligns it with a format that can be efficiently consumed by a pretrained LLM. The preprocessing captures the complexity and variability of scenarios, their semantic richness and ambiguity, dynamic environment interaction, and the level of detail required, in a structured format that enables efficient consumption by the LLM to generate driving scenarios in a standard driving scenario format. The output is post processed by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantic discrepancies in the context of a logical scenario within the driving scenario, to resolve potential errors in the output of the pretrained LLM. This enables obtaining the machine-readable driving scenarios.


Patent Information

Application #:
Filing Date: 04 May 2023
Publication Number: 45/2024
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Parent Application

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th Floor, Nariman Point, Mumbai - 400021, Maharashtra, India

Inventors

1. DULEPET, Sanjay
Tata Consultancy Services Limited, 379 Thornall Street - 4th Floor, Edison, New Jersey - 08837, United States of America
2. AMUR, Vignesh Lakshminarayanan
Tata Consultancy Services Limited, Aura House 2nd Floor, 1 Harrison Way, Leamington Spa, Warwickshire - CV31 3HH, United Kingdom

Specification

FORM 2

THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003

COMPLETE SPECIFICATION
(See Section 10 and Rule 13)

Title of invention:
GENERATING AUTONOMOUS VEHICLE (AV) DATASET BY PROCESSING SOURCED DRIVING SCENARIOS

Applicant:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India

The following specification particularly describes the invention and the manner in which it is to be performed.
CROSS REFERENCE TO RELATED APPLICATIONS AND PRIORITY
[001] The present application claims priority from provisional patent application number 202321031887 filed in India on 4 May 2023, the entire contents of the aforementioned application are incorporated herein by reference.

TECHNICAL FIELD
[002] The disclosure herein generally relates to the field of data processing, and more particularly, to a method and system for generating Autonomous Vehicle (AV) dataset by processing sourced driving scenarios.

BACKGROUND
[003] Autonomous Vehicles (AVs) today are not commercially viable due to limitations of the existing training data that is unable to expose the AVs to ever growing edge cases.
[004] In Artificial Intelligence (AI) development, success or failure lies significantly in a data science team's ability to handle all possible scenarios, specifically including the edge cases: rare occurrences in how a Machine Learning (ML) model reacts to data that cause inconsistencies and interrupt the usability of an AI tool. This is especially crucial with generative AI (GenAI) taking center stage. Conventional attempts to capture the maximum possible driving scenarios, specifically the edge cases or non-trivial driving scenarios, rely on tapping or sourcing data associated with recorded events or incidents such as accidents. Conventional sources to capture such data include ego vehicles, accident datasets, and incidents recorded on social media. These sources in one way or another record data in a structured manner around some incident. However, the majority of traffic data and driving data is hidden in the experience, spontaneous actions, and human intelligence of commuters, drivers, and observers responding to an unknown or unexpected situation occurring on the roads and their surrounding environment. Such experiences may or may not be associated with an event such as an accident but have huge potential to be edge cases.
[005] With the conventional data sourcing approaches used today, the above-mentioned valuable data goes untapped. Furthermore, even if attempts are made to capture the data, the data is huge and completely raw, and requires evolved processing approaches to rightly capture quality edge case data to simulate the scenarios for AI-based AV training.

SUMMARY
[006] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
[007] For example, in one embodiment, a method for generating Autonomous Vehicle (AV) dataset is provided. The method includes acquiring a plurality of driving scenarios shared in a natural language via a plurality of multimodal inputs associated with a plurality of end user devices, wherein each driving scenario among the plurality of driving scenarios associated with vehicle driving experience is converted to an input text scenario, wherein the input text scenario is an unstructured data format.
[008] Further, the method includes preprocessing, using at least one of domain aware Large Language Models (LLMs), ontologies and Knowledge Graphs, the input text scenario associated with each driving scenario, wherein the preprocessed data provides a structured data format for each driving scenario that contextualizes and aligns for consumption by a pretrained LLM for generating a standard scenario format. The preprocessing on the input text scenario comprises:
1) Extracting a plurality of scenario attributes associated with the input text scenario comprising entity attributes, environmental conditions, traffic conditions, driver behavior, scenario dynamics, infrastructure, sensors and communication, and story and act.
2) Generating the structured data format by performing text cleaning, entity recognition, action and event identification, parameter extraction to obtain quantified parameters from qualitative parameters present in the input text scenario and weather conditions.
3) Obtaining structural transformation of the structured data by adhering to schema of the standard scenario format.
4) Performing schema validation and logical validation of the structured data format to ensure accuracy of one or more events described in the input text scenario.
5) Performing context and intent recognition to accurately interpret and transform the input text scenario into a formalized driving scenario with technical precision.
6) Identifying factual content within a narrative of the input text scenario and converting it into the machine-readable driving scenario, which is a deterministic driving scenario.
7) Enhancing poorly described scenarios in the input text scenario in accordance with the standard scenario format to ensure logical consistency and completeness, prompting for additional information to ensure descriptions are adequately detailed.
8) Applying in-context prompting for embedding the narrative within the input text scenario and structured representations for guiding the pretrained LLM to accurately interpret and transform the narrative.
9) Applying chain-of-thought prompting to dissect the narratives into logical steps, facilitating extraction and structuring of the driving scenario details into a machine-readable format.
10) Removing personally identifiable information (PII).
11) Identifying similar driving scenarios based on a similarity score from among a plurality of driving scenarios generated for the plurality of driving scenarios, to rank them in accordance with the frequency of occurrence of similar scenarios.
12) Detecting cultural, linguistic, and contextual biases within the narrative to adjust the narrative to neutralize identified biases.
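To illustrate step 2 above, the following is a minimal sketch of how qualitative parameters in an input text scenario might be converted into quantified parameters. The mapping table, regular expressions, and numeric values are hypothetical illustrations, not part of the disclosed system; a real implementation would derive such mappings from domain ontologies.

```python
import re

# Hypothetical mapping from qualitative weather phrases to quantified
# parameters; illustrative values only.
QUALITATIVE_MAP = {
    "heavy rain": {"precipitation_mm_per_hr": 10.0, "visibility_m": 150},
    "light rain": {"precipitation_mm_per_hr": 1.0, "visibility_m": 800},
    "dense fog":  {"precipitation_mm_per_hr": 0.0, "visibility_m": 50},
}

# Matches explicit speeds such as "60 km/h" in the narration.
SPEED_PATTERN = re.compile(r"(\d+)\s*(?:km/h|kmph|kph)", re.IGNORECASE)

def extract_parameters(input_text: str) -> dict:
    """Quantify qualitative weather terms and extract explicit speeds."""
    params = {}
    lowered = input_text.lower()
    for phrase, values in QUALITATIVE_MAP.items():
        if phrase in lowered:
            params.update(values)
    speeds = [int(m) for m in SPEED_PATTERN.findall(input_text)]
    if speeds:
        params["ego_speed_kmph"] = speeds[0]
    return params

print(extract_parameters("Ego vehicle at 60 km/h in heavy rain near a junction"))
```

A production pipeline would apply many such extraction rules (entities, actions, events) and merge the results into the structured data format described above.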
[009] Further, the method includes processing, using the pretrained LLM, the structured data format obtained after preprocessing to generate a machine-readable driving scenario in the standard scenario format for each driving scenario. Furthermore, the method includes performing post processing and scenario validation on the machine-readable driving scenario by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantic discrepancies in the context of a logical scenario within the driving scenario, to resolve potential errors in the output of the pretrained LLM and finetune the machine-readable driving scenario for each driving scenario, wherein the pretrained LLM is finetuned using reinforcement learning by incorporating feedback received for the generated machine-readable driving scenario. The finetuned machine-readable driving scenario generated for each of the plurality of driving scenarios generates an Autonomous Vehicle (AV) dataset to be consumed for training and simulation of AVs. The post processing and scenario validation comprises correcting the detected syntax and semantics, error correction and optimization, and lexical analysis of machine-readable driving scenarios in the standard scenario format.
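As a minimal illustration of the syntax and structural checks the post processing stage could apply to an LLM-generated scenario (OpenSCENARIO® being XML-based), the sketch below checks well-formedness and the presence of a few top-level elements. The required-element set here is a simplified, hypothetical subset, not the full standard schema; a real validator would use the official XSD.

```python
import xml.etree.ElementTree as ET

# Simplified subset of top-level elements; NOT the full OpenSCENARIO schema.
REQUIRED_TOP_LEVEL = {"FileHeader", "Entities", "Storyboard"}

def validate_scenario(xml_text: str) -> list:
    """Return a list of detected discrepancies; an empty list means the
    machine-readable scenario passed these basic lexical/structural checks."""
    issues = []
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Syntax discrepancy: the LLM output is not even well-formed XML.
        return [f"syntax error: {err}"]
    if root.tag != "OpenSCENARIO":
        issues.append(f"unexpected root element: {root.tag}")
    present = {child.tag for child in root}
    for tag in REQUIRED_TOP_LEVEL - present:
        issues.append(f"missing required element: {tag}")
    return issues

sample = "<OpenSCENARIO><FileHeader/><Entities/></OpenSCENARIO>"
print(validate_scenario(sample))  # flags the missing Storyboard element
```

Semantic checks on the logical scenario (e.g., whether the described events are physically consistent) would build on top of such structural validation.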
[010] In another aspect, a system for generating Autonomous Vehicle (AV) dataset is provided. The system comprises a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to acquire a plurality of driving scenarios shared in a natural language via a plurality of multimodal inputs associated with a plurality of end user devices, wherein each driving scenario among the plurality of driving scenarios associated with vehicle driving experience is converted to an input text scenario, wherein the input text scenario is an unstructured data format.
[011] Further, the one or more hardware processors are configured to preprocess, using at least one of domain aware Large Language Models (LLMs), ontologies and Knowledge Graphs, the input text scenario associated with each driving scenario, wherein the preprocessed data provides a structured data format for each driving scenario that contextualizes and aligns for consumption by a pretrained LLM for generating a standard scenario format. The preprocessing on the input text scenario comprises:
1) Extracting a plurality of scenario attributes associated with the input text scenario comprising entity attributes, environmental conditions, traffic conditions, driver behavior, scenario dynamics, infrastructure, sensors and communication, and story and act.
2) Generating the structured data format by performing text cleaning, entity recognition, action and event identification, parameter extraction to obtain quantified parameters from qualitative parameters present in the input text scenario and weather conditions.
3) Obtaining structural transformation of the structured data by adhering to schema of the standard scenario format.
4) Performing schema validation and logical validation of the structured data format to ensure accuracy of one or more events described in the input text scenario.
5) Performing context and intent recognition to accurately interpret and transform the input text scenario into a formalized driving scenario with technical precision.
6) Identifying factual content within a narrative of the input text scenario and converting it into the machine-readable driving scenario, which is a deterministic driving scenario.
7) Enhancing poorly described scenarios in the input text scenario in accordance with the standard scenario format to ensure logical consistency and completeness, prompting for additional information to ensure descriptions are adequately detailed.
8) Applying in-context prompting for embedding the narrative within the input text scenario and structured representations for guiding the pretrained LLM to accurately interpret and transform the narrative.
9) Applying chain-of-thought prompting to dissect the narratives into logical steps, facilitating extraction and structuring of the driving scenario details into a machine-readable format.
10) Removing personally identifiable information (PII).
11) Identifying similar driving scenarios based on a similarity score from among a plurality of driving scenarios generated for the plurality of driving scenarios, to rank them in accordance with the frequency of occurrence of similar scenarios.
12) Detecting cultural, linguistic, and contextual biases within the narrative to adjust the narrative to neutralize identified biases.
[012] Further, the one or more hardware processors are configured to process, using the pretrained LLM, the structured data format obtained after preprocessing to generate a machine-readable driving scenario in the standard scenario format for each driving scenario. Furthermore, the one or more hardware processors are configured to perform post processing and scenario validation on the machine-readable driving scenario by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantic discrepancies in the context of a logical scenario within the driving scenario, to resolve potential errors in the output of the pretrained LLM and finetune the machine-readable driving scenario for each driving scenario, wherein the pretrained LLM is finetuned using reinforcement learning by incorporating feedback received for the generated machine-readable driving scenario.
[013] The finetuned machine-readable driving scenario generated for each of the plurality of driving scenarios generates an Autonomous Vehicle (AV) dataset to be consumed for training and simulation of AVs. The post processing and scenario validation comprises correcting the detected syntax and semantics, error correction and optimization, and lexical analysis of machine-readable driving scenarios in the standard scenario format.
[014] In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions, which when executed by one or more hardware processors cause a method for generating an Autonomous Vehicle (AV) dataset to be performed. The method includes acquiring a plurality of driving scenarios shared in a natural language via a plurality of multimodal inputs associated with a plurality of end user devices, wherein each driving scenario among the plurality of driving scenarios associated with vehicle driving experience is converted to an input text scenario, wherein the input text scenario is an unstructured data format.
[015] Further, the method includes preprocessing, using at least one of domain aware Large Language Models (LLMs), ontologies and Knowledge Graphs, the input text scenario associated with each driving scenario, wherein the preprocessed data provides a structured data format for each driving scenario that contextualizes and aligns for consumption by a pretrained LLM for generating a standard scenario format. The preprocessing on the input text scenario comprises:
1) Extracting a plurality of scenario attributes associated with the input text scenario comprising entity attributes, environmental conditions, traffic conditions, driver behavior, scenario dynamics, infrastructure, sensors and communication, and story and act.
2) Generating the structured data format by performing text cleaning, entity recognition, action and event identification, parameter extraction to obtain quantified parameters from qualitative parameters present in the input text scenario and weather conditions.
3) Obtaining structural transformation of the structured data by adhering to schema of the standard scenario format.
4) Performing schema validation and logical validation of the structured data format to ensure accuracy of one or more events described in the input text scenario.
5) Performing context and intent recognition to accurately interpret and transform the input text scenario into a formalized driving scenario with technical precision.
6) Identifying factual content within a narrative of the input text scenario and converting it into the machine-readable driving scenario, which is a deterministic driving scenario.
7) Enhancing poorly described scenarios in the input text scenario in accordance with the standard scenario format to ensure logical consistency and completeness, prompting for additional information to ensure descriptions are adequately detailed.
8) Applying in-context prompting for embedding the narrative within the input text scenario and structured representations for guiding the pretrained LLM to accurately interpret and transform the narrative.
9) Applying chain-of-thought prompting to dissect the narratives into logical steps, facilitating extraction and structuring of the driving scenario details into a machine-readable format.
10) Removing personally identifiable information (PII).
11) Identifying similar driving scenarios based on a similarity score from among a plurality of driving scenarios generated for the plurality of driving scenarios, to rank them in accordance with the frequency of occurrence of similar scenarios.
12) Detecting cultural, linguistic, and contextual biases within the narrative to adjust the narrative to neutralize identified biases.
[016] Further, the method includes processing, using the pretrained LLM, the structured data format obtained after preprocessing to generate a machine-readable driving scenario in the standard scenario format for each driving scenario.
[017] Furthermore, the method includes performing post processing and scenario validation on the machine-readable driving scenario by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantic discrepancies in the context of a logical scenario within the driving scenario, to resolve potential errors in the output of the pretrained LLM and finetune the machine-readable driving scenario for each driving scenario, wherein the pretrained LLM is finetuned using reinforcement learning by incorporating feedback received for the generated machine-readable driving scenario. The finetuned machine-readable driving scenario generated for each of the plurality of driving scenarios generates an Autonomous Vehicle (AV) dataset to be consumed for training and simulation of AVs. The post processing and scenario validation comprises correcting the detected syntax and semantics, error correction and optimization, and lexical analysis of machine-readable driving scenarios in the standard scenario format. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS
[018] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[019] FIG. 1 is a functional block diagram of a system for generating Autonomous Vehicle (AV) dataset by processing sourced driving scenarios, in accordance with some embodiments of the present disclosure.
[020] FIG. 2 depicts overall architecture and process flow of the system of FIG. 1 for generating AV dataset by processing sourced driving scenarios, in accordance with some embodiments of the present disclosure.
[021] FIG. 3 illustrates the downstream integration of the system of FIG. 1 for generating simulation environments for training AVs, in accordance with some embodiments of the present disclosure.
[022] FIG. 4 illustrates a flow diagram of a method for generating Autonomous Vehicle (AV) dataset by processing sourced driving scenarios, in accordance with some embodiments of the present disclosure.
[023] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems and devices embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

DETAILED DESCRIPTION OF EMBODIMENTS
[024] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope being indicated by the following embodiments described herein.
[025] While driving on the road or observing the surroundings while driving, many small or big real-time scenarios can occur. These real-time scenarios or experiences are interpreted and analyzed by drivers and observers, and most times are responded to with spontaneous actions based on human intelligence. With such actions, an accident may have been averted, yet this experience or learning remains in the individual human's brain and goes unrecorded or untapped. Recording it, however, poses the technical challenge of collecting, capturing, and handling humongous data, which is in crude form and exists everywhere from a busy road of a metropolitan city to remote parts of the globe. Thus, today, with conventional approaches, this information on driving scenarios, typically including edge cases or non-trivial driving scenarios, goes untapped. Non-trivial driving scenarios encompass challenging conditions such as adverse weather, heavy traffic, construction zones, and unexpected events, requiring experience and skill to navigate safely and prevent incidents. As can be understood, this untapped data is critical to understanding all possible driving scenarios for making Autonomous Vehicle (AV) training as good as possible. This is one of the important approaches that can contribute largely to commercially viable AVs, since driving technology can be improved, but without an understanding of all possible driving scenarios, the system intelligence of an AV will fall short.
[026] Crowdsourcing is one way of information gathering. However, maximum edge case data or non-trivial driving scenarios, captured for the AVs via crowdsourcing, need wide, dense global coverage to tap the maximum possible data. Thus, it is necessary to generate input interfaces that allow any interested person to easily connect and share information via a hassle-free, easy-to-connect interface. Offering various easy-to-use and combinational input modalities is one key to improving user contributions to crowdsourcing. Another source for obtaining driving scenarios to build a robust dataset covering maximum scenarios is external data sources that tap and record event information such as accidents. Furthermore, from the humongous data gathered, filtering junk or irrelevant data and extracting the right interpretation and context of the data is another technical challenge in constructing the driving scenarios to be provided to Machine Learning (ML) models, such as Generative Artificial Intelligence (GenAI) models like Large Language Models (LLMs), to generate standard machine-readable driving scenarios. Raw data, specifically data received in natural language, needs to be preprocessed before being fed to the LLMs.
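One concrete element of preprocessing raw natural language narrations before they reach an LLM is scrubbing personally identifiable information (PII), as listed among the preprocessing steps in the summary. The sketch below uses simple, hypothetical regular-expression patterns purely for illustration; a real deployment would use a dedicated PII-detection component tuned to local phone, email, and number-plate formats.

```python
import re

# Hypothetical, illustrative PII patterns (email, 10-digit phone,
# Indian-style vehicle registration plate); not production-grade.
PII_PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "<EMAIL>"),
    (re.compile(r"\b\d{10}\b"), "<PHONE>"),
    (re.compile(r"\b[A-Z]{2}\s?\d{2}\s?[A-Z]{1,2}\s?\d{4}\b"), "<PLATE>"),
]

def scrub_pii(narration: str) -> str:
    """Replace likely PII in a driving narration with placeholder tokens."""
    for pattern, token in PII_PATTERNS:
        narration = pattern.sub(token, narration)
    return narration

print(scrub_pii("Truck MH 12 AB 1234 cut me off; call me on 9876543210"))
```

Scrubbing before LLM consumption keeps the crowdsourced dataset usable while reducing the risk of leaking contributors' or third parties' identities.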
[027] One of the works in the literature, titled "Lang2LTL: Translating Natural Language Commands to Temporal Specification with Large Language Models" by Jason Xinyu Liu et al., proposes preprocessing of robotic tasks received in natural language to convert them to a structured format such as Linear Temporal Logic (LTL) using an LLM. LTL enables robots to understand and execute complex, sequence-dependent tasks communicated through natural language. Enabling the LLM to better consume the natural language and generate tasks in the LTL format requires preprocessing of the natural language. However, the preprocessing in this work is limited to the requirements of the LTL format. Preprocessing requirements specific to a standard scenario format such as OpenSCENARIO®, which would enable the LLM to consume a natural language scenario and generate a driving scenario in, say, the OpenSCENARIO® format, are not addressed. Understanding driving scenarios from natural language so that they can be efficiently consumed by the LLM for generating driving scenarios in the OpenSCENARIO® format needs processing at the next level, as scenario complexities that are implicit in the natural language, related to road conditions, weather conditions and so on, need to be extracted from the natural language by understanding the context of the description.
[028] Generating machine-readable scenarios, such as those defined by OpenSCENARIO®, from real-life user-driving narrations presents distinct challenges compared to converting text narrations of use cases into a Task-Level 1 (TL1) language for robotics such as LTL. These challenges stem from the inherent complexity and variability of real-world driving scenarios versus the structured and often more predictable domain of robotics tasks. The differentiators that contribute to this complexity are mentioned below:
Complexity and Variability of Scenarios
• Driving Scenarios: Real-life driving involves many unpredictable elements, including varying road conditions, weather, traffic patterns, and human behavior. Capturing the nuances of these elements in a machine-readable format like OpenSCENARIO® requires a deep understanding of driving dynamics and the ability to anticipate a wide range of potential situations.
• Robotics Use Cases: Robotics scenarios, while complex, tend to occur in more controlled environments with a limited set of variables. The tasks are often repetitive or follow predefined rules, making it easier to model these scenarios in a TL1 language.
Semantic Richness and Ambiguity
• Driving Scenarios: Driving narrations are rich in semantic content and often contain ambiguities that machines can interpret only with extensive contextual understanding. The language used to describe driving scenarios can be highly variable and subjective, requiring sophisticated natural language processing (NLP) techniques to parse and understand.
• Robotics Use Cases: The language used to describe robotics tasks is usually more technical and precise, with less room for ambiguity. It makes it easier to convert text narrations directly into a TL1 language, as there is a more explicit mapping between the instructions and the actions the robot needs to perform.
Dynamic Environment Interaction
• Driving Scenarios: Driving requires real-time decision-making in response to dynamic environmental changes. Generating scenarios in OpenSCENARIO® involves describing static elements and modeling the interactions between the vehicle, other entities, and the environment over time.
• Robotics Use Cases: While robotics also deals with dynamic interactions, the scope is often narrower, and the interactions are more predictable. Engineers can design the environments in which robots operate to reduce uncertainty, making it easier to model these interactions in a TL1 language.
Level of Detail Required
• Driving Scenarios: To accurately simulate real-life driving scenarios, a high level of detail is necessary, including the precise behavior of the vehicle, environmental conditions, and the actions of other road users. This level of granularity is essential for testing advanced driver-assistance systems (ADAS) and autonomous driving technologies.
• Robotics Use Cases: The level of detail required for modeling robotics tasks can be lower, as the focus is often on achieving a specific outcome or performing a particular action. The environment and the interactions can be abstracted to a degree that allows for effective simulation and testing.
Standardization and Interoperability
• Driving Scenarios: The automotive industry is still working towards fully standardizing formats like OpenSCENARIO for describing driving scenarios. The lack of universal standards complicates the generation and sharing of scenarios across different simulation platforms and development teams.
• Robotics Use Cases: The field of robotics has made significant progress in developing standardized languages and protocols for describing and executing tasks. This standardization facilitates the conversion of text narrations into a machine-readable format and enhances interoperability between different systems. Also, there is no guarantee that the output of the LLMs always follows the machine-readable scenario format, resulting in a valid driving scenario. Thus, there is a need for post processing the output of LLMs to produce the driving scenario format, resulting in valid driving scenarios.
[029] Another work in the literature, titled "Processing Data for Large Language Models" by Bharat Ramanathan, discusses a generic preprocessing approach for consumption of data by LLMs. However, for capturing target-application-specific nuances (typically, driving scenario generation), the preprocessing approach selected should address the needs of the target application by contextualizing the input text for the received input driving scenario narration. The rightly preprocessed data can then be efficiently and accurately translated to machine-readable scenario formats such as OpenSCENARIO® by the LLMs.
[030] Embodiments herein provide a method and system for generating an Autonomous Vehicle (AV) dataset by processing sourced driving scenarios. Driving scenarios, which include crowdsourced data and data from external databases, are received via multi-modal input and converted to input text scenarios. Domain aware LLMs, ontologies and Knowledge Graphs preprocess each input text scenario to generate structured data for the driving scenario that contextualizes it and aligns it with a format that can be efficiently consumed by a pretrained LLM. The preprocessing captures the complexity and variability of scenarios, their semantic richness and ambiguity, dynamic environment interaction, and the level of detail required, in a structured format that enables efficient consumption by the LLM to generate driving scenarios in a standard driving scenario format such as OpenSCENARIO®. The output is post processed by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantic discrepancies in the context of a logical scenario within the driving scenario, to resolve potential errors in the output of the pretrained LLM. This enables obtaining the machine-readable driving scenarios.
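The three-stage flow described above (preprocessing, generation by the pretrained LLM, and post processing) can be sketched as follows. All function names are illustrative, the LLM call is stubbed out rather than a real model invocation, and the preprocessing and post-processing bodies are drastically simplified stand-ins for the techniques the disclosure describes.

```python
def preprocess(input_text_scenario: str) -> dict:
    """Stand-in for contextualizing an unstructured narration into the
    structured data format (attribute extraction, cleaning, validation)."""
    narrative = input_text_scenario.strip()
    return {
        "narrative": narrative,
        "attributes": {"weather": "rain" if "rain" in narrative else "clear"},
    }

def llm_generate(structured: dict) -> str:
    """Stub for the pretrained LLM producing a standard-format scenario;
    a real system would call a deployed model here."""
    return f"<OpenSCENARIO><!-- from: {structured['narrative'][:30]} --></OpenSCENARIO>"

def postprocess(scenario_xml: str) -> str:
    """Stand-in for post processing and scenario validation; real systems
    apply scenario analysis and synthesis techniques here."""
    return scenario_xml if scenario_xml.startswith("<OpenSCENARIO>") else ""

def generate_av_dataset(narrations: list) -> list:
    """Chain the three stages over all sourced narrations."""
    return [postprocess(llm_generate(preprocess(n))) for n in narrations if n]

dataset = generate_av_dataset(["A car braked suddenly in heavy rain."])
print(len(dataset))  # 1
```

The resulting collection of finetuned machine-readable scenarios forms the AV dataset consumed by downstream training and simulation.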
[031] The pretrained LLM is finetuned using reinforcement learning by incorporating feedback received for the generated driving scenario.
[032] Referring now to the drawings, and more particularly to FIGS. 1 through 4, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments, and these embodiments are described in the context of the following exemplary system and/or method.
[033] FIG. 1 is a functional block diagram of a system 100, for generating Autonomous Vehicle (AV) dataset by processing sourced driving scenarios, in accordance with some embodiments of the present disclosure.
[034] In an embodiment, the system 100 includes a processor(s) 104, communication interface device(s), alternatively referred as input/output (I/O) interface(s) 106, and one or more data storage devices or a memory 102 operatively coupled to the processor(s) 104. The system 100 with one or more hardware processors is configured to execute functions of one or more functional blocks of the system 100.
[035] Referring to the components of the system 100, in an embodiment, the processor(s) 104 can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers and desktops, with users connecting through user end devices such as mobile phones, smart phones, personal digital assistants, and the like. The system 100 connects with a cloud environment (as shown in FIG. 2) via the I/O interface 106, wherein GenAI models such as a pretrained LLM for scenario generation and domain aware Large Language Models (LLMs) for preprocessing to derive context for scenario generation can be deployed and then accessed via Application Programming Interfaces (APIs) for performing one or more steps of the method implemented by the system 100. Any state-of-the-art LLMs and domain-specific LLMs can be used and fine-tuned for the task.
[036] The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, a voice interface and so on, and can facilitate multiple communications within a wide variety of network (N/W) and protocol types including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface(s) 106 can include one or more ports for connecting a number of devices to one another or to another server or devices (such as user end mobile devices) and cloud servers or cloud environment. The proposed interface option to connect with end users such as drivers, commuters, and observers on the road to source data for driving scenarios such as edge cases and non-trivial driving scenarios is further explained in conjunction with the overall architecture of the system 100 in FIG. 2. Additionally, the system 100 can acquire driving scenarios from external databases via Application Programming Interfaces (APIs (7) as depicted in FIG. 3).
[037] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. The memory may include a plurality of modules 110 such as any Artificial Intelligence (AI)/Machine Learning (ML) models (for example, LLMs or natural language generative Artificial Intelligence (AI) models and the like), the preprocessing module (2), the post processing module (3), the scenario validation module (4) (depicted in FIG. 2) and so on. Various modules to convert multimodal input received from user end devices to input text scenarios, such as Convolutional Neural Networks (CNNs), transformer based architectures, BiLSTM combined with attention, etc., that are well known in the art can be accessed from the cloud or can be stored in the memory 102.
[038] Further, memory 102 may include a database 108 to store the generated AV dataset (also depicted in FIG. 2) along with sourced data from end users, preprocessed data, post processed data generated by various modules of the system 100. It can also include the knowledge graphs, ontologies etc., required by the preprocessing module for performing preprocessing and post processing module for performing post processing. The memory 102, may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure. In an embodiment, the database 108 may be external (as shown in FIG. 2) to the system 100 and coupled via the I/O interface 106. Functions of the components of the system 100 are explained in conjunction with FIG. 2 through 4.
[039] FIG. 2 depicts the overall architecture and process flow of the system 100 for generating AV dataset by processing sourced driving scenarios, in accordance with some embodiments of the present disclosure.
[040] The system 100 (also depicted by reference numeral (6) in the architecture) is built around pre-trained Large Language Models (LLMs), capable of generating logical scenarios in standard formats such as OpenScenario® from natural language specifications of any road user input. Any LLM or similar model well known in the literature can be used. Now referring to FIG. 2, the natural language generative AI model, or the LLM, is initially fine-tuned by providing a few shots of pairs of natural language scenario descriptions as input, along with corresponding OpenScenario® content. The fine-tuned pretrained LLM consumes the driving scenarios (1) (for example, user-shared edge cases or driving scenarios accessed from external sources or databases) after they are preprocessed by the preprocessing module (2). Subsequently, pretrained LLMs are queried by providing only the natural language scenario description, producing, for example, OpenScenario® content. The pre-processing module (2) contextualizes the input to the LLM using heuristic techniques. The post-processing module (3) is essential as it performs necessary transformations based on scenario analysis and synthesis techniques that understand the syntax and semantics of a logical scenario to resolve potential errors in the LLM output. Scenario validation (4) and feedback loop (5) integration can further improve the quality with more usage. The preprocessing techniques, the transformation approaches, post processing, and scenario validation are explained with a use case example in conjunction with the method steps of FIG. 4.
[041] Sourcing Driving Scenarios (e.g., Edge Case Input / Non-trivial Driving Scenarios): During sourcing of the input (edge cases and non-trivial driving scenarios from crowd sourcing or external databases), the system 100 offers various input modalities via different user end devices such as mobile devices, smartphones, etc. Approaches for encouraging user interaction, by maximizing exposure and attracting prospective users to contribute driving experiences by narrating or texting driving scenarios, are listed below. Techniques well known in the art can be used to convert the user input describing the edge case scenario, received via one or more input modalities, into an input text scenario.
• Text input through a browser-based application, mobile application, or other applications allows any road user to interact
• Voice input through a process that converts into compatible text format making it more accessible.
• Video input with compatible text input augments the video creating an enriching experience.
• Mixed input of video, augmented with text and corresponding audio information, allows for a more accurate representation of the edge case description.
• Text input generated from existing data sources such as accident and insurance databases gets translated into a compatible text format.
• The lowest level contains the text input, which allows for a more extensible and scalable system, especially important as increasingly different input modalities are supported.
• Natural language description of the driving scenario such as the edge case can make for a great user experience.
• Multilingual LLM allows text input in more natural languages as the capability of LLM continues to evolve, opening up to cultures and regions in a global world.
[042] In capturing the driving experience using multi-modal inputs, the objective is to create detailed textual descriptions encompassing various aspects such as visual scenes, auditory cues, and contextual information. The techniques collectively enable the creation of detailed textual representations that encapsulate the driving experience across various modalities.
1) Image Recognition: Utilizing convolutional neural networks (CNNs) to analyze images captured from the driver's perspective, extracting features like road conditions, surrounding vehicles, traffic signs, and landmarks. These visual cues can be translated into detailed textual descriptions, providing insights into the driving environment.
2) Speech Recognition: Incorporating speech recognition technology that uses end-to-end deep learning approaches and the like to transcribe the driver's spoken observations, comments, or reactions during the driving experience. It includes capturing verbal descriptions of road conditions, weather, traffic situations, and any notable events encountered along the way.
3) Audio-to-Text Conversion: Converting audio signals from the vehicle's surroundings, such as engine sounds, honking, sirens, or ambient noise, into text descriptions using state of the art models such as Transformer-based architectures and the like. It can provide additional context about the driving environment, including other vehicles, pedestrians, or emergency vehicles.
4) Video Summarization: Analyzing video footage from in-car or dashboard cameras to extract keyframes and scenes, identifying essential visual elements such as lane markings, road signs, traffic flow, and interactions with other road users. It includes translating the visual cues into detailed textual descriptions of the driving experience. State of the art models such as BiLSTM combined with attention and the like can be used for video summarization.
5) Fusion Models: Integrating information from multiple modalities using models proposed in the literature such as interpretable heterogeneous ensembles, and the like to integrate information from images, audio, and video, using fusion models to generate comprehensive textual representations of the driving experience. By combining visual, auditory, and contextual information, these models can create rich and detailed descriptions that capture the essence of the journey.
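The fusion step described above can be sketched as a simple merge of per-modality captions into a single textual description. The function name and the fixed modality ordering below are illustrative assumptions, not part of the disclosed system, which may use learned fusion models instead:

```python
def fuse_modality_captions(captions: dict) -> str:
    """Merge per-modality text descriptions into one narrative.

    `captions` maps a modality name (e.g. "image", "audio", "video")
    to the textual description produced by that modality's model.
    A fixed ordering keeps the fused description deterministic.
    """
    order = ["image", "speech", "audio", "video"]
    parts = []
    for modality in order:
        text = captions.get(modality, "").strip()
        if text:
            parts.append(f"[{modality}] {text}")
    return " ".join(parts)
```

A learned fusion model would weight and reconcile the modalities rather than concatenate them, but the output contract is the same: one text description per driving experience.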
[043] Pre-processing of Input: The pre-processing module (2) transforms the driving scenarios described in natural language text into a contextualized intent-aware query programmatically applied to the finetuned natural language generative AI (Gen AI) model such as the LLM. The preprocessing steps are described in detail in conjunction with FIG. 4. The finetuned language model leveraged by the pre-processing module is a distilled version of the Pre-trained LLM (PTLLM) and is much smaller and cheaper to run. Instruction prompting finetunes the PTLLM with high-quality tuples of (scenario description, ground-truth scenario output). Several heuristic techniques increase the probability of getting the desired result given the input, underscoring the pre-processing module's utility.
[044] Preprocessing enables the system to best leverage and benefit from the pretrained LLM's capabilities. The preprocessing effectively aligns with user intent and generates the expected outcome (a driving scenario in a standard driving format) from the pretrained LLM.
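The (scenario description, ground-truth scenario output) tuples mentioned above can be assembled into a few-shot prompt along the following lines. This is a minimal sketch with a hypothetical helper name; the actual instruction-prompting format used for the PTLLM is not specified in the disclosure:

```python
def build_fewshot_prompt(examples, new_scenario):
    """Assemble a few-shot prompt from (description, ground-truth) tuples.

    `examples` is a list of (natural-language description, OpenSCENARIO
    content) pairs; the new scenario is appended with an empty answer
    slot for the pretrained LLM to complete.
    """
    blocks = []
    for description, ground_truth in examples:
        blocks.append(f"Scenario: {description}\nOpenSCENARIO:\n{ground_truth}")
    blocks.append(f"Scenario: {new_scenario}\nOpenSCENARIO:")
    return "\n\n".join(blocks)
```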
[045] Post-processing of Output: Generally, there is no guarantee that the output of the finetuned large language model will always follow the machine-readable scenario format resulting in a valid driving scenario. The post-processing module performs output validation at both syntactical and semantic levels, checking the correctness of the structured fields and their values in a given scenario context. A set of inferred transformations help resolve common and recurring errors in the output generated by the finetuned large language model to produce valid scenarios.
• Scenario correctness testing is the process of executing syntax and grammar checker tools to find errors and is aimed primarily at improving quality. Once the machine-readable scenario is generated, for example in OpenScenario® format, the system 100 executes the Syntax Checker and Grammar Checker tools to find errors; the purpose of running such tools is to check the scenario's correctness.
• Recognizing patterns allows us to assess the appropriateness and identify common and recurring errors in the generated machine-readable scenarios, leading to more transformations.
• Lexical analysis of machine-readable scenarios generates tokens that, when parsed into an Abstract Syntax Tree (AST), logically describe a parse tree. The ASTs can be updated to fix inconsistencies or enhanced with information such as properties and annotations, and transformed into valid machine-readable scenarios.
• The AST representation of the machine-readable scenario is suitable when performing System Theoretic Process Analysis (STPA), a safety and hazard analysis method, to evaluate a driving scenario. The effectiveness evaluation of driving scenarios is a qualification before including them in the AV dataset (included within database 108).
• A corpus of identified errors classified into categorical error types can provide valuable insights into methodological approaches to resolving or addressing them.
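A minimal sketch of the syntax-checking and transformation idea above, using Python's standard XML parser as a stand-in for dedicated OpenSCENARIO® tooling. The wrap-stray-root repair is one hypothetical example of a recurring-error transformation, not the disclosed set of transformations:

```python
import xml.etree.ElementTree as ET

def check_and_repair(xml_text: str, required_root: str = "OpenSCENARIO"):
    """Syntax-check an LLM-emitted scenario and apply a simple repair.

    Returns (is_valid, repaired_xml_or_error_message).
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Syntactically broken output cannot be repaired here.
        return False, f"syntax error: {err}"
    # Example recurring-error transformation: wrap a stray root
    # element in the expected top-level element.
    if root.tag != required_root:
        wrapper = ET.Element(required_root)
        wrapper.append(root)
        root = wrapper
    return True, ET.tostring(root, encoding="unicode")
```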
[046] Scenario validation module (4) and Feedback Loop (5): Learning from human preferences has emerged as a powerful paradigm for measuring the performance of finetuned large language models (LLMs) and using the feedback as a loss to optimize the model. Scenario validation module learns from user interactions to improve its accuracy, quality, and reliability. Finetuning the language model with reinforcement learning can improve the system 100 by assimilating user feedback.
• The system 100 actively learns from usage by incorporating user feedback into the processing modules.
• User scoring low on the quality of the generated machine-readable scenarios indicates the need for incremental finetuning and/or updating the pre-processing strategies and/or updating the post-processing transformations.
• Users assigning controversial scores on the text input can help find the correlation between a low-quality machine-readable scenario or failure to generate a machine-readable scenario and its controversialness.
• Users evaluating the machine-readable scenario to provide an intent and context score indicates the degree of ambiguity in the corresponding input text or feedback for improving the components (1) and (2) of the system 100.
• Users can interact with the AV dataset to enhance the quality of the machine-readable scenarios.
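The low-score trigger for incremental finetuning described above can be sketched as a simple aggregation rule. The threshold, score scale, and minimum-vote values below are illustrative assumptions:

```python
def needs_finetuning(quality_scores, threshold=3.0, min_votes=5):
    """Flag incremental finetuning when the mean user quality score
    (assumed here to be on a 1-5 scale) drops below a threshold,
    once enough feedback has accumulated.
    """
    if len(quality_scores) < min_votes:
        return False  # not enough feedback yet to act on
    return sum(quality_scores) / len(quality_scores) < threshold
```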
[047] An evaluation framework described in conjunction with FIG.4 facilitates regression testing and helps evaluate the finetuned LLM. The output of LLM is evaluated for accuracy, relevance of responses, ability to recall and precisely apply context, and relevance of provided context.
[048] FIG. 3 illustrates the downstream integration of the system of FIG. 1 for generating simulation environments for training AVs, in accordance with some embodiments of the present disclosure.
[049] The natural language generative AI model, or the LLM, converts driving scenario descriptions in natural language format (8) into scenario formats compatible with running in a simulation-based validation toolchain (9), and continues to improve in performance as LLMs get larger, more powerful, and more versatile. The proposed disclosure also enables participation by anyone, anywhere, to contribute to the driving scenario database referred to as the AV dataset (database 108) through a natural language interface, scaling indefinitely as more road users engage across geographies with an enormous diversity of driving conditions around the world. The Application Programming Interface (API) (7) around the system 100 opens up the possibility for other future integrations of the system 100 beyond generating and storing the AV dataset. This enables co-innovation and advances further innovation with the system 100 as the base.
[050] The AV industry is doubling down on simulation as it gets closer to commercialization. Enabling smart validation to scale the necessary level of validation of the AV software, for object detection and decision-making in handling driving scenarios, paves the way to more sustainable, cost-effective, and safer autonomy. Thus, the system 100 provides a comprehensive AV dataset necessary for a coverage-driven cognitive AV verification and validation framework. It helps validate the functional correctness of AD functions and takes an open data-driven approach at scale to building safer autonomy. As depicted in FIG. 3, the system 100 also works across multifaceted testing environments spanning simulation, closed test tracks, and on-road. The smart validation tools help the user decide which scenarios to test and how to test them (cloud-based simulation, on a proving ground, public roads) while ensuring maximum coverage when given a specific AV objective. The system 100 provides the architecture in a way that enables ease in scaling up and quickly testing many scenarios. The intent is to produce scenarios for all the possible driving scenarios where the AV software is more likely to fail, typically including the "edge cases or the non-trivial driving scenarios". The system 100 thus contributes to ensuring driverless vehicles' safety, wherein the edge case scenario database or the AV dataset (10) captures critical information to provide maximum test coverage.
[051] The synthesized logical scenarios from any road user's natural language input using the system 100 contribute to the driving scenario database (108). Any simulation tools, referred to as smart validation tools as shown in FIG. 3, can then provide simulator-agnostic, cost-effective, safe, faster, efficient, and sustainable ways to verify and validate logical scenarios (9) generated by the system 100 (6).
[052] FIG. 4 illustrates a flow diagram of a method 400 for generating Autonomous Vehicle (AV) dataset by processing sourced driving scenarios, in accordance with some embodiments of the present disclosure.
[053] In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method 400 by the processor(s) or one or more hardware processors 104. The steps of the method 400 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG. 1 and FIG. 2 and the steps of flow diagram as depicted in FIG. 4. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods, and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
[054] Referring to the steps of the method 400, at step 402 of the method 400, the one or more hardware processors 104 are configured by the instructions to acquire a plurality of driving scenarios in a natural language via a plurality of multimodal inputs from a plurality of end user devices. Each driving scenario among the plurality of driving scenarios associated with vehicle driving experience is converted to an input text scenario, wherein the input text scenario is in an unstructured data format. The acquisition of the driving experiences via multimodal inputs and conversion to natural language text is as explained in the description of FIG. 1.
[055] At step 404 of the method 400, the one or more hardware processors 104 are configured by the instructions to preprocess via at least one of the domain aware Large Language Model (LLMs), using the ontologies and the Knowledge Graphs, the input text scenario associated with each driving scenario to generate a structured data format for each driving scenario. This processed data in structured data format contextualizes and aligns for consumption by a pretrained LLM for generating a standard scenario format. The preprocessing steps are listed below and later explained with help of use case:
1. Extracting a plurality of scenario attributes comprising entity attributes, environmental conditions, traffic conditions, driver behavior, scenario dynamics, infrastructure, sensors and communication, and story and act.
2. Generating the structured data format by performing text cleaning, entity recognition, action and event identification, parameter extraction to obtain quantified parameters from qualitative parameters present in the input text scenario and weather conditions.
3. Obtaining structural transformation of the structured data by adhering to schema of the standard scenario format.
4. Performing schema validation and logical validation of the structured data format to ensure accuracy of one or more events described in the input text scenario.
5. Performing context and intent recognition to accurately interpret and transform the input text scenario into a formalized driving scenario with technical precision.
6. Identifying factual content within a narrative of the input text scenario and converting it into the machine-readable driving scenario, which is deterministic driving scenario.
7. Enhancing poorly described scenarios in the input text scenario by applying the standard scenario format to ensure logical consistency and completeness, prompting for additional information to ensure descriptions are adequately detailed.
8. Applying in-context prompting for embedding the narrative within the input text scenario and structured representations for guiding the pretrained LLM to accurately interpret and transform the narrative;
9. Applying chain-of-thought prompting to dissect the narratives into logical steps, facilitating extraction and structuring of the driving scenario details into a machine-readable format.
10. Removing personally identifiable information (PII).
11. Identifying similar driving scenarios based on a similarity score from among the plurality of driving scenarios generated for the plurality of driving scenarios, to rank them in accordance with the frequency of occurrence of similar scenarios.
12. Detecting cultural, linguistic, and contextual biases within the narrative to adjust the narrative to neutralize identified biases.
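Steps 10 and 11 above (PII removal and similarity-based ranking) can be sketched as follows. The regex patterns and word-overlap (Jaccard) similarity below are simple stand-ins for the production techniques, which are not specified in the disclosure:

```python
import re

# Hypothetical PII patterns; a real system would cover many more types.
PII_PATTERNS = [
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
]

def scrub_pii(text: str) -> str:
    """Replace personally identifiable information with placeholders."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text

def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two scenario texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rank_by_frequency(scenarios, threshold=0.5):
    """Group near-duplicate scenarios and rank groups by occurrence count."""
    groups = []  # list of (representative text, count)
    for s in scenarios:
        for i, (rep, count) in enumerate(groups):
            if jaccard(s, rep) >= threshold:
                groups[i] = (rep, count + 1)
                break
        else:
            groups.append((s, 1))
    return sorted(groups, key=lambda g: g[1], reverse=True)
```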
[056] Preprocessing combines the contextual language understanding of domain aware LLMs, which can be accessed via APIs from the cloud environment, with structured insights from ontologies and knowledge graphs tailored for extracting information from AV domain narratives efficiently. The following preprocessing steps are applied to extract Named Entities (NEs) from the user-provided driving scenario narration (using the provided example) in the AV domain.
1. Identify Domain-specific Entities: Determine entities relevant to the AV domain, such as weather conditions ("heavy downpour"), obstacles ("sandbags," "storm drain"), actions ("avoid," "maneuver," "almost collided"), and objects involved ("lane," "vehicle").
2. Develop or Use an Ontology: Create or utilize an existing AV domain ontology that categorizes entities (e.g., Weather Conditions, Obstacles, Maneuvers, Traffic Participants) and defines their relationships.
3. Fine-tune an LLM for NER: Using the ontology, fine-tune a pre-trained Large Language Model on annotated driving scenario datasets, tagging each entity according to the ontology.
4. Integrate Knowledge Graph: Use a knowledge graph to add context and refine understanding, encoding relationships and attributes of entities like actions and objects relevant to driving scenarios.
5. Process the Scenario:
a. Preprocess the text for normalization and segmentation.
b. Extract NEs using the fine-tuned LLM, identifying and categorizing entities based on the text and ontology.
c. Refine and Enrich entity recognition using the knowledge graph for context, such as inferring the significance of "sandbags" near "storm drains."
6. Evaluate and Iterate: Measure the system's accuracy against a manually annotated benchmark, iterating on model training, ontology, and knowledge graph adjustments to improve recognition performance.
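A gazetteer-style sketch of steps 1 through 5, matching ontology phrases directly in the narrative. The category names and phrase lists below are illustrative fragments of an AV ontology; the disclosed system would use the fine-tuned LLM and knowledge graph for recognition rather than plain string matching:

```python
# Illustrative ontology fragment (assumed, not the disclosed ontology).
ONTOLOGY = {
    "WeatherCondition": ["heavy downpour", "fog", "snow"],
    "Obstacle": ["sandbags", "storm drain", "debris"],
    "Maneuver": ["avoid", "maneuver", "lane change"],
    "TrafficParticipant": ["vehicle", "pedestrian", "cyclist"],
}

def extract_entities(narrative: str) -> dict:
    """Tag narrative phrases with their ontology category."""
    text = narrative.lower()
    found = {}
    for category, phrases in ONTOLOGY.items():
        hits = [p for p in phrases if p in text]
        if hits:
            found[category] = hits
    return found
```

Applied to the example narrative, this tags "sandbags" and "storm drain" as Obstacles and "heavy downpour" as a Weather Condition, mirroring the entity breakdown in paragraph [058].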
[057] Example input text scenario:
“The other day, after much-needed respite from the heavy downpour, I attempted to avoid sandbags near a storm drain blocking the lane with a maneuver that almost collided with the side of a vehicle in the adjacent lane.”
[058] The extracted plurality of scenario attributes for the above example are listed and explained below:
• Entity Attributes:
o Vehicles: The scenario involves at least two vehicles: the narrator's vehicle attempting to maneuver and the vehicle in the adjacent lane, which almost collided. These vehicles' models, sizes, appearances, and types are not specified but are central to assessing risk and required response.
o Obstacles: Sandbags near a storm drain blocking the lane are mentioned as obstacles. Their size, appearance, and type are critical for understanding how they might affect vehicle dynamics and decision-making.
• Environmental Conditions:
o Weather Conditions: The scenario mentions a respite from heavy rain, implying wet road conditions and possibly reduced visibility or traction.
o Time of Day: Not explicitly mentioned but given the visibility of the storm drain and sandbags, it could be inferred as daylight or dusk.
o Road Conditions: Wet surfaces are implied due to recent rain, affecting vehicle dynamics such as braking distance and maneuverability.
• Traffic Conditions:
o Traffic Flow: The mention of an adjacent lane suggests multi-lane roads with unspecified traffic density. The ability to perform a maneuver implies moderate traffic.
o Traffic Rules: Not explicitly mentioned, but lane usage and avoidance maneuvers indicate a scenario where lane discipline and right-of-way rules are in effect.
• Driver Behavior:
o Driver Profiles: The narrator attempts an avoidance maneuver, suggesting a cautious or normal driving style concerned with safety.
o Human Factors: The decision to avoid sandbags and the subsequent near-collision indicate a situation where reaction time and decision-making are critical.
• Scenario Dynamics:
o Event Triggers: The sandbags blocking the lane act as an event trigger, causing the narrator to attempt a maneuver.
o Actions and Maneuvers: The scenario includes a lane change attempt as a maneuver to avoid an obstacle, leading to a near-collision scenario.
• Infrastructure:
o Road Geometry: Implied is a multi-lane road with at least one lane blocked by sandbags. The exact configurations, intersections, or curves are not detailed.
o Signage and Markings: Not mentioned, but potentially relevant for understanding the legality and safety of the maneuver attempted.
• Sensors and Communication:
o Sensor Models: Not mentioned, but in the context of autonomous or semi-autonomous vehicles, the ability to detect sandbags, wet road conditions, and nearby vehicles would be crucial.
o V2X Communication: Not mentioned, but communication with infrastructure could have potentially alerted the vehicle to the hazard ahead or the presence of the sandbags.
• Story and Act:
o Story: The sequence of events starts with a respite from heavy rain, identification of an obstacle (sandbags), and an attempted avoidance maneuver that leads to a near-collision scenario.
o Act: The act involves the detection of the obstacle, decision-making process to attempt a maneuver, and the execution of the maneuver with the near-miss outcome.
[059] The text cleaning, entity recognition, action and event identification, parameter extraction to obtain quantified parameters from qualitative parameters present in the input text scenario and weather conditions for above example is provided below:
Text Cleaning:
i. Remove Irrelevant Information: Eliminate parts of the text that don't contribute to the scenario description (e.g., greetings, irrelevant details).
ii. Standardize Terminology: Ensure consistent use of terms for vehicle actions, road types, weather conditions, etc.
b. Original Text: "The other day, after much-needed respite from the heavy downpour, I attempted to avoid sandbags near a storm drain blocking the lane with a maneuver that almost collided with the side of a vehicle in the adjacent lane."
c. Cleaned Text: "Attempted to avoid sandbags near a storm drain blocking the lane; maneuver almost resulted in a collision with the side of a vehicle in the adjacent lane."
Entity Recognition:
i. Identify Entities: Use Named Entity Recognition (NER) to identify and categorize key elements such as vehicles, pedestrians, traffic signs, and environmental conditions.
ii. Tag Entities: Tag identified entities appropriately to differentiate between different actors (e.g., Car 1, Pedestrian A).
Entities Identified:
iii. Ego Vehicle (EV): The vehicle attempting the maneuver.
iv. Obstacle: Sandbags near a storm drain.
v. Other Vehicle (OV): Vehicle in the adjacent lane.
Action and Event Identification
i. Extract Actions: Identify verbs or phrases that describe actions (e.g., turning, accelerating) and link them to the correct entities.
ii. Sequence Events: Determine the sequence of events and actions. It's crucial for the scenario's logical flow.
a. Actions:
i. Ego Vehicle: Attempting avoidance maneuver.
ii. Other Vehicle: Present in adjacent lane.
b. Events:
i. Avoidance: Ego Vehicle near collision with Other Vehicle.
Parameter Extraction:
i. Quantify Descriptions: Convert qualitative descriptions into quantitative parameters (e.g., speed in mph or km/h, distances in meters).
ii. Identify Conditions: Extract conditions like weather or lighting and convert them into OpenSCENARIO®-compatible parameters.
Quantified Descriptions:
i. Speeds, distances, and positions would need to be specified but are not given in the narrative.
Identified Conditions:
i. Weather: Post-downpour conditions, potentially wet road surface.
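The text-cleaning and parameter-extraction steps above can be sketched with simple rules. The filler pattern and unit regex are illustrative assumptions; a production pipeline would rely on the fine-tuned LLM for these steps:

```python
import re

# Hypothetical filler pattern for narrative openers with no scenario content.
FILLER = re.compile(r"^the other day,?\s*", re.IGNORECASE)

def clean_text(raw: str) -> str:
    """Drop narrative filler that carries no scenario content."""
    return FILLER.sub("", raw).strip()

def extract_parameters(text: str) -> dict:
    """Pull quantified parameters and weather conditions from the text."""
    speeds = [int(m) for m in re.findall(r"(\d+)\s*(?:km/h|mph)", text)]
    return {
        "speeds": speeds,  # empty for the example: no speeds are narrated
        "wet_road": "downpour" in text.lower(),
    }
```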
[060] Obtaining structural transformation of the structured data by adhering to schema of the standard scenario format.
a. Template Mapping: Map the cleaned and structured text data to OpenSCENARIO® elements and attributes. This could involve using templates for common scenario types.
b. XML Generation: Convert the mapped data into XML, adhering to the OpenSCENARIO® schema. This involves creating elements for actions, events, conditions, and so on, according to the schema's structure.
OpenSCENARIO® Elements:
i. Storyboard: Describes the sequence of events.
ii. Actions: Ego Vehicle's avoidance maneuver.
iii. Events: The near-collision event.
iv. Entities: Ego Vehicle and Other Vehicle.
v. Environment: Weather conditions affecting the road.
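Template mapping and XML generation can be sketched with the standard library as below. The element names follow the OpenSCENARIO® vocabulary loosely and omit the many attributes a schema-valid file requires; this is a skeleton, not a conforming generator:

```python
import xml.etree.ElementTree as ET

def to_openscenario_skeleton(scenario: dict) -> str:
    """Map structured scenario fields onto a minimal OpenSCENARIO-like tree.

    `scenario` is assumed to carry "entities" (names) and "events"
    (event names) extracted in the earlier preprocessing steps.
    """
    root = ET.Element("OpenSCENARIO")
    entities = ET.SubElement(root, "Entities")
    for name in scenario.get("entities", []):
        ET.SubElement(entities, "ScenarioObject", name=name)
    storyboard = ET.SubElement(root, "Storyboard")
    for event in scenario.get("events", []):
        ET.SubElement(storyboard, "Event", name=event)
    return ET.tostring(root, encoding="unicode")
```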
[061] Performing schema validation and logical validation of the structured data format to ensure accuracy of one or more events described in the input text scenario.
• Schema Validation: Use XML schema validation tools to ensure the generated OpenSCENARIO® file and its XML structure adhere to the standard schema.
• Logical Validation: Review the scenario to ensure it makes logical sense and accurately represents the intended driving situation and the described event.
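Schema and logical validation can be sketched as below. Python's standard parser only checks well-formedness, so the required-element and entity checks stand in for full XSD validation against the official OpenSCENARIO® schema; the required-element list and the logical rule are illustrative assumptions:

```python
import xml.etree.ElementTree as ET

REQUIRED = ["Entities", "Storyboard"]  # assumed minimal required elements

def validate_scenario(xml_text: str) -> list:
    """Return a list of validation errors (empty list = valid)."""
    errors = []
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    # Schema-style check: required top-level elements are present.
    for tag in REQUIRED:
        if root.find(tag) is None:
            errors.append(f"missing required element: {tag}")
    # Logical check: a scenario must declare at least one entity.
    if root.find("Entities") is not None and \
            not root.findall("Entities/ScenarioObject"):
        errors.append("Entities element declares no ScenarioObject")
    return errors
```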
[062] Performing context and intent recognition to accurately interpret and transform the input text scenario into a formalized driving scenario with technical precision.
Context and intent recognition:
a. By using contextual embeddings, the system 100 achieves a nuanced understanding of natural language narratives, accurately interpreting and transforming them into formalized driving scenarios with technical precision.
i. Embedding Model Utilization: Employs LLMs pre-trained on extensive corpora and fine-tuned on driving-related datasets. This fine-tuning customizes the contextual embeddings to the driving domain, enhancing the model’s capability to discern driving-specific terminology and actions.
ii. Contextual Disambiguation: Leverages the fine-tuned embeddings to perform word sense disambiguation, enabling the system 100 to correctly interpret words with multiple meanings based on driving context (e.g., 'right' as a direction versus a legal right-of-way).
iii. Sequential Context Processing: Applies sequential context understanding through the embeddings to map the order of narrative events, essential in constructing the temporal sequence of driving actions (e.g., indicating before turning).
iv. Attention Mechanism Integration: Integrates attention mechanisms that enable the model to focus on parts of the narrative crucial for driving decisions, prioritizing terms indicative of maneuvers or hazards.
v. Contextual Relevance Assessment: Utilizes the embeddings to score the relevance of terms within the driving context, determining their impact on the scenario (e.g., the significance of 'avoided collision' versus 'drove past').
vi. Scenario Mapping: The embeddings inform the mapping process, where natural language descriptions are converted into OpenSCENARIO elements, ensuring the preservation of semantic nuances in the structured driving scenario.
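The contextual relevance assessment (item v above) can be illustrated with a toy scoring sketch: terms are scored against a driving-context vector by dot product. The vectors below are hand-made stand-ins for learned contextual embeddings, not outputs of any actual model.

```python
# Toy relevance scoring: hypothetical 3-d vectors stand in for
# contextual embeddings; higher dot product = more context-relevant.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

context = [0.9, 0.1, 0.8]          # assumed "driving hazard" context vector
term_vectors = {
    "avoided collision": [0.8, 0.2, 0.9],
    "drove past":        [0.2, 0.9, 0.1],
}

scores = {term: dot(vec, context) for term, vec in term_vectors.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # 'avoided collision' outranks 'drove past'
```

With real embeddings the same ranking mechanism would surface the terms most significant to scenario construction.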
[063] A user inputs the narrative, 'The other day, after much-needed respite from the heavy downpour, I attempted to avoid sandbags near a storm drain blocking the lane with a maneuver that almost collided with the side of a vehicle in the adjacent lane.' The system 100 processes this text to understand 'sandbags' as a static obstacle, 'storm drain' as a road feature, and 'maneuver' as an avoidance action. The intent classification discerns a near-collision event to be avoided. The stochastic model evaluates the likelihood of collision based on the narrative and selects the most probable outcome for the scenario. A scenario construction engine creates an OpenSCENARIO® file where the ego vehicle performs a lateral maneuver to avoid a collision, consistent with the narrative. This file is then used to simulate the described scenario, ensuring the accuracy of the context and intent capture.
[064] Identifying factual content within a narrative of the input text scenario and converting it into the machine-readable driving scenario, which is a deterministic driving scenario.
a. This method 400 integrates natural language processing (NLP), natural language understanding (NLU), contextual analysis, and machine learning to identify factual content within narratives and convert them into deterministic, machine-readable scenarios.
i. Parsing and Understanding: Applies known NLP techniques to tokenize the narrative and NLU to identify key components such as entities, actions, and contextual details.
ii. Contextual Analysis: Uses domain-specific machine learning models to discern the driving scenario's context, distinguishing between subjective and objective elements.
• Factual Extraction: Implements algorithms to isolate factual information from subjective expressions, focusing on essential details for scenario construction. Factual analysis employs various algorithms and approaches from natural language processing (NLP) and text analysis to extract factual information and isolate subjective expressions from the provided scenario narration. Example algorithms and techniques that can help with this task include:
• Named Entity Recognition (NER): Using NER to identify and extract essential entities such as "heavy downpour," "sandbags," "storm drain," "lane," and "vehicle." It can help in focusing on factual details related to the driving scenario.
• Sentiment Analysis: Applying sentiment analysis to identify subjective expressions such as "much-needed respite" and "almost collided." It will help in isolating subjective opinions or emotions expressed in the narration.
• Part-of-Speech (POS) Tagging: Utilizing POS tagging to distinguish between subjective words (e.g., adjectives, adverbs) and factual information (e.g., nouns, verbs). It can help filter out subjective expressions and focus on essential details.
• Dependency Parsing: Employing dependency parsing to analyze the sentence's syntactic structure and identify relationships between words. It can help objectively understand the actions and events described in the narration.
• Keyword Extraction: Using keyword extraction techniques to identify and prioritize essential terms and phrases related to the driving scenario, such as "avoid sandbags," "storm drain," and "collided with a vehicle." It can assist in extracting essential details for scenario construction.
• Text Summarization: Applying text summarization techniques to concisely summarize the driving scenario, focusing on factual information while minimizing subjective expressions. It can help in extracting the most critical details for scenario analysis.
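The NER, sentiment, and POS techniques listed above normally rely on trained NLP models; as an illustrative stand-in only, a small keyword heuristic can show the intended separation of factual scenario terms from subjective expressions. The word lists below are hand-made for this example, not part of the described system.

```python
# Toy factual-vs-subjective split. A real system would use NER, POS
# tagging, and sentiment models as described above; these keyword sets
# are hypothetical stand-ins.
SUBJECTIVE = {"much-needed", "respite", "almost"}
FACTUAL = {"sandbags", "storm", "drain", "lane", "vehicle", "downpour",
           "avoid", "collided", "maneuver"}

def split_factual(narrative):
    tokens = [t.strip(".,").lower() for t in narrative.split()]
    facts = [t for t in tokens if t in FACTUAL]
    subjective = [t for t in tokens if t in SUBJECTIVE]
    return facts, subjective

facts, subj = split_factual(
    "After much-needed respite from the heavy downpour, "
    "I attempted to avoid sandbags near a storm drain.")
print(facts)  # factual scenario terms, in narrative order
print(subj)   # subjective expressions filtered out
```

The factual list feeds scenario construction while the subjective list is discarded, mirroring the extraction goal of this step.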
iii. Transformation Logic: Develops a logic for mapping natural language components to OpenSCENARIO elements, ensuring accurate representation of the driving scenario.
iv. Consistency and Determinism Checks: Validates the OpenSCENARIO output for schema compliance and logical consistency, ensuring deterministic outcomes.
v. Iterative Refinement: Refines the conversion process through iterative feedback from evaluation, enhancing scenario accuracy and realism.
b. For the input "The other day, after much-needed respite from the heavy downpour, I attempted to avoid sandbags near a storm drain blocking the lane with a maneuver that almost collided with the side of a vehicle in the adjacent lane":
i. Parsing and Understanding: Identifies "sandbags" as an obstacle, "attempted to avoid" as an action, and "heavy downpour" as contextual detail.
ii. Contextual Analysis & Factual Extraction: Focuses on the maneuver to avoid sandbags, disregarding subjective expressions like "much-needed respite".
iii. Transformation Logic: Maps the narrative to OpenSCENARIO® elements representing the sandbags obstacle, the avoidance maneuver, and the near-collision event.
iv. Consistency and Determinism Enforcement: Ensures the output aligns with OpenSCENARIO® standards and the narrative's logical flow.

[065] Enhancing poorly described scenarios: A streamlined approach ensures accurate, efficient conversion of descriptive driving scenarios into structured OpenSCENARIO format.
i. Preprocessing and Normalization: Text inputs are standardized through lowercasing, punctuation removal, and tokenization to reduce variability and facilitate processing.
ii. Entity Recognition and Extraction: A domain-specific NLP model identifies driving-related entities (vehicles, obstacles, maneuvers) from the normalized text. For example, "sandbags," "storm drain," and "maneuver" are tagged appropriately in the input scenario.
iii. Contextual Analysis and Relationship Mapping: The system 100 analyzes sentence structure to understand spatial and temporal relationships between entities, determining that the "maneuver" was an avoidance action near "sandbags" and a "storm drain" with a near-collision in the "adjacent lane."
iv. Validation and Correction: A validation module applies driving scenario standards to ensure logical consistency and completeness. It prompts for additional information if necessary, ensuring descriptions like the maneuver and near-miss event are adequately detailed.
v. Transformation to OpenSCENARIO® Format: Validated entities and relationships are mapped to OpenSCENARIO® elements, creating a structured representation of the scenario, including avoidance maneuvers and near-miss events.
vi. Optimization and Learning Feedback Loop: The system 100 learns from user corrections and expert feedback, enhancing entity recognition and processing capabilities for future scenarios.
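The preprocessing and normalization step (item i above) can be sketched directly with the standard library. This is a minimal sketch assuming whitespace tokenization; a production system would use a proper tokenizer.

```python
import re

# Normalization sketch: lowercasing, punctuation removal (hyphens kept
# so compounds like "much-needed" survive), then whitespace tokenization.
def normalize(text):
    text = text.lower()
    text = re.sub(r"[^\w\s-]", "", text)   # drop punctuation, keep hyphens
    return text.split()

tokens = normalize("The other day, after much-needed respite!")
print(tokens)
```

The resulting token list is the standardized input consumed by the entity-recognition step that follows.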
[066] Applying in-context prompting for embedding the narrative within the input text scenario and structured representations for guiding the pretrained LLM to accurately interpret and transform the narrative: A streamlined approach ensures accurate, efficient conversion of descriptive driving scenarios into structured OpenSCENARIO® format.
i. Preprocessing and Normalization: Text inputs are standardized through lowercasing, punctuation removal, and tokenization to reduce variability and facilitate processing.
ii. Entity Recognition and Extraction: A domain-specific NLP model identifies driving-related entities (vehicles, obstacles, maneuvers) from the normalized text. For example, "sandbags," "storm drain," and "maneuver" are tagged appropriately in the input scenario.
iii. Contextual Analysis and Relationship Mapping: The system 100 analyzes sentence structure to understand spatial and temporal relationships between entities, determining that the "maneuver" was an avoidance action near "sandbags" and a "storm drain" with a near-collision in the "adjacent lane."
iv. Validation and Correction: A validation module applies driving scenario standards to ensure logical consistency and completeness. It prompts for additional information if necessary, ensuring descriptions like the maneuver and near-miss event are adequately detailed.
v. Transformation to OpenSCENARIO® Format: Validated entities and relationships are mapped to OpenSCENARIO® elements, creating a structured representation of the scenario, including avoidance maneuvers and near-miss events.
vi. Optimization and Learning Feedback Loop: The system 100 learns from user corrections and expert feedback, enhancing entity recognition and processing capabilities for future scenarios.

[067] Applying chain-of-thought prompting to dissect the narratives into logical steps, facilitating extraction and structuring of the driving scenario details into a machine-readable format.
a. A method and system for transforming intricate natural language descriptions of driving scenarios into OpenSCENARIO® format, leveraging a Chain-of-Thought (CoT) prompting strategy. This approach dissects narratives into logical steps, facilitating the extraction and structuring of scenario details into a machine-readable format, detailing environmental conditions, obstacle types and locations, vehicle actions, and potential hazards, enabling accurate scenario representation.
i. Input Reception: Receives a narrative, e.g., "After a heavy downpour, attempting to avoid sandbags near a storm drain, I nearly collided with a vehicle in the adjacent lane."
ii. CoT Prompting: Utilizes CoT to decompose the narrative into elements such as environmental conditions, obstacles, actions, and outcomes.
iii. Detail Extraction: From each logical step, extracts specifics (e.g., obstacle types, maneuvers) that align with OpenSCENARIO® parameters.
iv. Scenario Structuring: Constructs an OpenSCENARIO® document by mapping extracted details to corresponding entities, attributes, and actions in the format.
v. Output Generation: Produces an OpenSCENARIO® document encapsulating the original narrative.
b. For the provided scenario, the system 100:
i. Identifies the environmental condition ("heavy downpour"),
ii. Notes the obstacle ("sandbags near a storm drain"),
iii. Recognizes the action ("maneuver to avoid") and
iv. Acknowledges the outcome ("near collision with adjacent vehicle").
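The four-part decomposition above can be sketched as follows. In the described system the decomposition is performed by chain-of-thought prompting of an LLM; here simple keyword rules stand in for each reasoning step purely to show the target structure.

```python
# Illustrative stand-in for CoT decomposition: keyword rules produce the
# same four fields (environment, obstacle, action, outcome) that the
# prompted LLM would extract. The rules are hypothetical.
def decompose(narrative):
    n = narrative.lower()
    return {
        "environment": "heavy downpour" if "downpour" in n else "clear",
        "obstacle": "sandbags near storm drain" if "sandbags" in n else None,
        "action": "avoidance maneuver" if "avoid" in n else None,
        "outcome": "near collision"
                   if ("collided" in n or "collision" in n) else None,
    }

steps = decompose("After a heavy downpour, attempting to avoid sandbags "
                  "near a storm drain, I nearly collided with a vehicle.")
print(steps)
```

Each extracted field then maps to the corresponding OpenSCENARIO® entities, attributes, and actions in the structuring step.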
[068] Removing personally identifiable information (PII).
a. To safeguard personally identifiable information (PII) in text inputs while transforming natural language narratives into structured formats like OpenSCENARIO, a concise, innovative technical workflow integrates Personal Information Detection (PID), balancing technical sophistication with privacy preservation.
i. Text Normalization: Input text is standardized through normalization—adjusting case, eliminating extraneous spaces, and rectifying common typos—to prepare for precise PII detection.
ii. PID via NLP Models: Advanced NLP models, equipped with Named Entity Recognition (NER) capabilities, scan the normalized text to identify potential PII, leveraging context and entity recognition to pinpoint information such as names, addresses, and other identifiers.
iii. Contextual Analysis for PII Confirmation: A secondary, contextual analysis distinguishes actual PII from false positives. This step employs algorithms to understand narrative context, ensuring accurate identification by relating detected entities to nearby indicative keywords or phrases. Several approaches are employed to analyze the scenario narration and distinguish actual personally identifiable information (PII) from false positives while also understanding the contextual narrative.
• Named Entity Recognition (NER):
Use NER to identify entities such as names of individuals, locations, dates, and organizations.
Example: Identify “storm drain” and “lane” as contextual keywords and use them to contextualize nearby entities.
• Keyword Proximity Analysis:
Analyze the proximity of detected entities to specific keywords or phrases related to the driving scenario.
Example: Analyze the proximity of entities like “vehicle,” “collision,” “lane,” etc., to identify relevant PII.
• Contextual Clustering:
Group detected entities based on their semantic similarity and context within the narrative.
Example: Group entities related to driving maneuvers, road conditions, and vehicles to understand the context better.
• Sentiment Analysis:
Analyze the sentiment of the narrative to identify phrases or keywords indicating personal experiences or emotions.
Example: Determine the sentiment around phrases like “much-needed respite” to understand the emotional context.
iv. Anonymization or Redaction of PII: Identified PII undergoes anonymization or redaction, replaced with generic placeholders, or removed, based on the requirement to maintain narrative integrity for further processing.
v. Conversion to Structured Format: The sanitized narrative is finally transformed into a structured format like OpenSCENARIO®, employing algorithms to parse and map narrative elements (actions, entities, conditions) to structured language constructs.
b. In the scenario, “The other day, after a respite from heavy downpour, I attempted to avoid sandbags near a storm drain, almost colliding with a vehicle in the adjacent lane,” the process involves:
i. Normalization: Adjusting the narrative’s format for consistency.
ii. PID via NLP Models: Scanning for PII; in this case, none is present.
iii. Transformation: Parsing the scenario to extract elements like the environmental condition (“heavy downpour”), obstacles (“sandbags”), and actions (“avoid” and “almost colliding”), then structuring these details in OpenSCENARIO® format, ensuring accurate depiction without PII exposure.
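The anonymization/redaction step (item iv above) could be sketched with regular expressions for a few common identifier shapes. This is a hedged illustration: the pattern set and placeholder labels are assumptions, and production PII detection would combine NER models with the contextual analysis described above rather than regexes alone.

```python
import re

# Toy PII redaction: two hypothetical patterns (email, US-style phone)
# are replaced with placeholders to preserve narrative structure.
PII_PATTERNS = {
    "EMAIL": r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",
    "PHONE": r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b",
}

def redact(text):
    for label, pattern in PII_PATTERNS.items():
        text = re.sub(pattern, "[%s]" % label, text)
    return text

sanitized = redact("Contact john@example.com or 555-123-4567 "
                   "about the incident.")
print(sanitized)  # identifiers replaced with [EMAIL] / [PHONE]
```

Placeholders (rather than deletion) keep the sentence grammatically intact, which matters for the downstream conversion to structured format.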
[069] Identifying similar driving scenarios based on a similarity score from among a plurality of driving scenarios generated for the plurality of driving scenarios, to rank them in accordance with frequency of occurrence of similar scenarios.
[070] Using search and recommendation techniques to identify similar scenarios
a. Driving scenarios are ranked based on their similarity scores (for example, cosine similarity scores) in descending order using custom word embeddings trained on a large corpus of driving scenario narrative text data. Higher similarity scores indicate greater similarity between scenarios.
b. Custom word embeddings are transformed into scenario embeddings using simple aggregation, pooling, or transformer-based models. Pooling techniques, such as max-pooling or average-pooling, over the word embeddings allow the scenario embedding model to focus on important features while reducing the dimensionality of the embeddings. Alternatively, transformer-based scenario embedding models are initialized with custom word embeddings and fine-tuned with the driving scenario text dataset. The trained scenario embedding model generates fixed-length vector representations based on the custom embeddings represented in a multi-dimensional space.
c. Using custom word embeddings allows capturing domain-specific or task-specific semantics, resulting in more contextually relevant and accurate scenario embeddings. Word embeddings capture complex associations with other words, connected through various relations such as synonymy, antonymy, similarity, relatedness, and connotation. Using word embeddings in generating scenario embeddings requires considering multiple factors, including the context window, similarity measurement, and potential biases.
d. The custom word embeddings capture the meaning based on linguistic distribution in a multi-dimensional space. In vector semantics, meaning is represented as a point in space, where each word is a vector. Words with similar meanings are located near each other in this semantic space, automatically constructed by analyzing word proximity in text. Representing word meaning as vectors allows for generalization to similar but unseen words.
e. The similarity between scenarios represented as fixed-length vectors is calculated using cosine similarity. The cosine of the angle between two vectors is measured after normalizing the dot product, resulting in a value between -1 and 1. A cosine similarity of 1 indicates maximal similarity or relatedness between vectors. Now that similarity scores for each scenario compared to other scenarios are available, they are ranked in decreasing order based on their frequency in the dataset.
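Steps b and e above (average-pooling word vectors into a scenario embedding, then ranking by cosine similarity) can be sketched numerically. The 3-d vectors and scenario labels are toy stand-ins for the custom-trained embeddings described in the text.

```python
import math

# Sketch: pool word vectors into one scenario embedding, then rank
# candidate scenarios by cosine similarity to it. All vectors here are
# hypothetical toy values.
def average_pool(word_vectors):
    n = len(word_vectors)
    return [sum(v[i] for v in word_vectors) / n
            for i in range(len(word_vectors[0]))]

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.hypot(*a) * math.hypot(*b))

query = average_pool([[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]])
scenarios = {
    "wet-road avoidance": [0.9, 0.05, 0.1],
    "highway cruising":   [0.0, 1.0, 0.0],
}
ranked = sorted(scenarios, key=lambda s: cosine(query, scenarios[s]),
                reverse=True)
print(ranked)  # most similar scenario first
```

Replacing average pooling with max pooling or a transformer-based encoder changes only how `query` is produced; the cosine ranking step is unchanged.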
[071] Detecting cultural, linguistic, and contextual biases within the narrative to adjust the narrative to neutralize identified biases for unbiased conversion of natural language driving scenarios into structured OpenSCENARIO format, leveraging custom algorithms for bias detection and mitigation during processing.
i. Pre-processing and Normalization:
1. Objective: Standardizes natural language narratives to remove informal expressions and standardize terminology.
2. Example: Transforms “much-needed respite from the heavy downpour” to “break in heavy rain.”
ii. Bias Detection:
1. Objective: Employs algorithms to identify cultural, linguistic, or contextual biases within the narrative.
2. Example: Analyzes emphasis and sentiment to flag potential biases in scenario description.
iii. Bias Mitigation:
1. Objective: Adjusts narrative to neutralize identified biases, ensuring objective and balanced scenario representation.
2. Example: Re-words narrative elements to remove undue emphasis, such as describing avoidance maneuvers without subjective bias.
iv. Conversion to OpenSCENARIO® Format:
1. Objective: Maps bias-mitigated narrative to OpenSCENARIO® entities and actions accurately, utilizing a structured ontology.
2. Example: Translates “attempted to avoid sandbags near a storm drain blocking the lane with a maneuver that almost collided with the side of a vehicle in the adjacent lane” into corresponding scenario elements like obstacles (sandbags), environmental conditions (storm drain), and vehicle dynamics (avoidance maneuver).
v. Post-conversion Validation:
1. Objective: Ensures the converted scenario accurately reflects the original narrative without bias.
2. Example: Verifies that the scenario’s representation of the avoidance maneuver and potential collision is objective and matches the narrative’s intent.
[072] At step 406 of the method 400, the one or more hardware processors 104 are configured by the instructions to process, by the pretrained LLM the structured data format obtained after preprocessing to generate a machine-readable driving scenario in the standard scenario format for each driving scenario.
[073] For example, LoRA, or Low-rank adaptation, is chosen for fine-tuning the Large Language Model (LLM) in converting driving scenario narration to structured OpenSCENARIO format due to its efficiency, preservation of pre-trained knowledge, task-specific adaptation capabilities, plug-and-play adapters, and scalability. By selectively modifying key parameters while freezing pre-trained weights, LoRA optimizes resource usage, maintains model generality, tailors model behavior to the task, facilitates seamless integration of adapters, and enables versatile application across various domains, making it an ideal choice for adapting LLMs to specific tasks like converting driving scenario narration to OpenSCENARIO® format.
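As a numerical illustration of the low-rank idea behind LoRA (not the actual fine-tuning procedure), the adapted weight is the frozen pretrained matrix plus a product of two small trainable matrices. All values below are hypothetical toy numbers chosen so the shapes are easy to follow.

```python
# LoRA arithmetic sketch: W = W0 + B @ A, where W0 is frozen and only
# B (d x r) and A (r x d) are trained; here rank r = 1.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

W0 = [[1.0, 0.0], [0.0, 1.0]]      # frozen pretrained weight (2x2)
B  = [[0.5], [0.1]]                # trainable, 2x1
A  = [[0.2, 0.4]]                  # trainable, 1x2

delta = matmul(B, A)               # low-rank update, still 2x2
W = [[W0[i][j] + delta[i][j] for j in range(2)] for i in range(2)]
print(W)   # adapted weight W0 + B @ A
```

Because only `A` and `B` are updated, the number of trainable parameters grows with the rank r rather than with the full weight dimensions, which is the efficiency and plug-and-play property the paragraph above attributes to LoRA.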
[074] At step 408 of the method 400, the one or more hardware processors 104 are configured by the instructions to perform post processing and scenario validation on the machine-readable driving scenario by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantics discrepancies in context of a logical scenario within the driving scenario to resolve potential errors in the output of the pretrained LLM to finetune the machine-readable driving scenario for each driving scenario, wherein the pretrained LLM is finetuned using reinforcement learning by incorporating feedback received for the generated machine-readable driving scenario. The discrepancies in the context of a logical scenario herein encompass syntax or semantic errors.
[075] The post processing and scenario validation further comprise correcting the detected syntax and semantic errors, error correction and optimization, and lexical analysis of machine-readable driving scenarios in the standard scenario format.
• Scenario correctness
o To ensure scenario correctness in transforming natural language driving narrations into OpenSCENARIO® format, a concise, technically precise process is applied, focusing on syntactical and semantic integrity. Here’s a streamlined approach using the example driving scenario: "The other day, after much-needed respite from the heavy downpour, I attempted to avoid sandbags near a storm drain blocking the lane with a maneuver that almost collided with the side of a vehicle in the adjacent lane":
▪ Transformation: The narrative is converted into structured OpenSCENARIO® elements, employing a fine-tuned language model for accurate encoding.
▪ Syntax Checking: Validates the XML structure against the OpenSCENARIO® schema, identifying syntax errors such as incorrect attributes or improperly closed tags.
▪ Grammar Checking: Ensures semantic correctness, verifying that the scenario logic, like maneuver sequences, aligns with expected behaviors and physical laws.
▪ Post-Processing Validation: This stage refines the scenario by addressing syntactical and semantic discrepancies detected earlier, ensuring that elements like the avoidance maneuver near sandbags are represented accurately and realistically.
▪ Error Correction and Optimization: Inferred transformations correct and optimize the scenario, ensuring realism and safety, such as adjusting the near-collision maneuver for feasibility.
• Recognizing patterns
o To effectively enhance OpenSCENARIO output for natural language driving scenarios, a streamlined yet sophisticated technical workflow is employed, focusing on pattern recognition, semantic analysis, and robust post-processing validation. Here's a breakdown using the example scenario of avoiding sandbags near a storm drain:
▪ Structured Data Conversion: Conversion of the narrative into structured OpenSCENARIO format, identifying entities (vehicles, obstacles), actions (avoidance maneuver), and conditions (post-rain scenario).
▪ Pattern Recognition and Error Detection: Using machine learning to recognize patterns and detect errors such as logical inconsistencies or unfeasible actions, based on the structured scenario data.
▪ Semantic Analysis: Analyzing the scenario's semantics to ensure actions and events are contextually appropriate and physically plausible within the driving environment.
▪ Post-Processing Validation:
▪ Syntactical Validation: Ensuring correct syntax in the OpenSCENARIO output, validating field formats and data types.
▪ Semantic Validation: Confirming the scenario's actions and events are logical and consistent with real-world driving contexts.
▪ Error Resolution and Transformation: Automatically correcting identified errors or inconsistencies, adjusting the scenario details to ensure logical and physical plausibility.
▪ Output Generation: Producing a validated and corrected OpenSCENARIO output that accurately reflects the original narrative while being technically sound.
o In the provided scenario, if the initial transformation inaccurately represents the sandbags' placement, leading to a potential collision, the system 100 identifies this as a semantic inconsistency. It then applies transformations to adjust either the sandbags' position or the avoidance maneuver, ensuring the final scenario is both realistic and aligned with expected driving behaviors. This streamlined process ensures OpenSCENARIO outputs are not only error-free but also enriched with contextually relevant and technically accurate details.
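The semantic-validation idea above (flagging logically inconsistent or physically implausible scenario content) can be sketched as rule checks over a structured scenario record. The field names and the plausibility bound below are assumptions introduced for illustration only.

```python
# Toy semantic checks: flag implausible values and missing logical
# counterparts in a structured scenario. MAX_LATERAL_SPEED is an
# assumed plausibility bound, not a value from the specification.
MAX_LATERAL_SPEED = 5.0   # m/s, assumed bound for a lane-change maneuver

def semantic_check(scenario):
    issues = []
    if scenario.get("lateral_speed", 0.0) > MAX_LATERAL_SPEED:
        issues.append("implausible lateral speed")
    if scenario.get("obstacle") and not scenario.get("avoidance_action"):
        issues.append("obstacle present but no avoidance action")
    return issues

print(semantic_check({"lateral_speed": 9.0, "obstacle": "sandbags"}))
print(semantic_check({"lateral_speed": 2.0, "obstacle": "sandbags",
                      "avoidance_action": "lane_change"}))
```

Each flagged issue would then trigger the error-resolution transformations described above, e.g. adjusting the sandbags' position or the maneuver parameters.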
• Lexical analysis of machine-readable
o The lexical analysis of machine-readable format, explained using the example scenario:
▪ Lexical Analysis: Parse the driving scenario narrative into tokens representing meaningful elements (e.g., actions like "avoid," objects like "sandbags," conditions like "heavy downpour").
▪ AST Construction: Build an Abstract Syntax Tree (AST) from these tokens, organizing them hierarchically to reflect the scenario's structure, such as actions taken, and entities involved.
▪ AST Enhancement: Amend the AST to rectify inconsistencies and augment it with additional data like weather conditions or potential hazards, ensuring a comprehensive representation of the scenario.
▪ Transformation to OpenSCENARIO: Convert the enriched AST into a valid OpenSCENARIO format, mapping abstract elements to specific attributes and elements required for machine readability.
▪ Post-processing Validation: Employ a post-processing module to validate the structured scenario on syntactical and semantic levels, ensuring its correctness and contextual accuracy.
▪ Inferred Transformations: Utilize inferred transformations via a finetuned large language model to correct common errors and enhance scenario details, refining the output to accurately reflect the original narrative.
o For the provided example, "The other day, after much-needed respite from the heavy downpour, I attempted to avoid sandbags near a storm drain blocking the lane with a maneuver that almost collided with the side of a vehicle in the adjacent lane," this process involves tokenizing the narrative, organizing the tokens into an AST that logically represents the scenario, enriching this tree with detailed annotations (e.g., specifying the maneuver's nature and safety considerations), and finally, transforming this enriched AST into a validated, machine-readable OpenSCENARIO format. This ensures the nuanced driving scenario is accurately captured in a structured format.
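The lexical-analysis and AST-construction steps above can be sketched as follows. The token vocabulary is a hand-made stand-in for a real lexer, and the flat dictionary below is a simplified substitute for a true hierarchical AST.

```python
# Toy lexer + tree builder: tokenize a narrative into (kind, value)
# pairs, then bucket tokens into a small scenario structure. VOCAB is
# hypothetical, chosen only to cover the running example.
VOCAB = {
    "avoid": "ACTION", "collided": "EVENT",
    "sandbags": "OBJECT", "vehicle": "OBJECT",
    "downpour": "CONDITION",
}

def lex(narrative):
    words = [w.strip(".,").lower() for w in narrative.split()]
    return [(VOCAB[w], w) for w in words if w in VOCAB]

def build_ast(tokens):
    ast = {"actions": [], "events": [], "objects": [], "conditions": []}
    bucket = {"ACTION": "actions", "EVENT": "events",
              "OBJECT": "objects", "CONDITION": "conditions"}
    for kind, value in tokens:
        ast[bucket[kind]].append(value)
    return ast

tokens = lex("After the heavy downpour I tried to avoid sandbags "
             "and almost collided with a vehicle.")
ast = build_ast(tokens)
print(ast)
```

The enrichment and OpenSCENARIO® mapping steps would then annotate and serialize this structure into schema-conformant XML.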
• A corpus of identified errors classified into categorical error types
o To effectively create a corpus of identified errors classified into categorical error types for enhanced OpenSCENARIO output in natural language driving scenario narrations, the approach involves several key steps:
▪ Error Identification and Classification: Utilizing NLP to parse and categorize errors within the driving narrative, focusing on syntactical errors (e.g., grammar, punctuation) and semantic errors (e.g., logical inconsistencies). For the example, this step would detect any inaccuracies in describing the maneuver to avoid sandbags and the near-collision.
▪ Syntactical Validation: Implementing a layer in the post-processing module for syntax checks against OpenSCENARIO standards, ensuring narrative coherence and grammatical accuracy. This includes validating the structured narrative of avoiding sandbags and almost colliding with a vehicle.
▪ Semantic Validation: Using domain-specific rules for logical and contextual correctness, verifying that the scenario (avoiding sandbags and the near-miss) aligns with realistic driving dynamics and OpenSCENARIO parameters.
▪ Error Transformation and Resolution: Applying inferred transformations based on common errors to automatically correct new outputs. For instance, adjusting the spatial description and action sequences in the provided scenario to resolve identified errors.
▪ Feedback Loop for Continuous Improvement: Establishing a mechanism to refine error classifications and transformations through analysis of resolved errors, enhancing model accuracy over time.
▪ Integration with OpenSCENARIO®: Ensuring all corrections adhere to OpenSCENARIO® formats and standards, facilitating their application in autonomous driving.
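Building the error corpus described above can be sketched as classifying detected error messages into categorical types and tallying them. The error messages and keyword rules below are hypothetical illustrations, not outputs of the actual validation modules.

```python
from collections import Counter

# Toy error classifier feeding a categorical error corpus; the keyword
# rules and sample messages are hypothetical.
def classify(error_msg):
    msg = error_msg.lower()
    if "tag" in msg or "attribute" in msg:
        return "syntactical"
    if "inconsistent" in msg or "implausible" in msg:
        return "semantic"
    return "other"

errors = [
    "unclosed tag in Storyboard",
    "implausible maneuver speed",
    "missing attribute on Event",
    "inconsistent obstacle position",
]
corpus = Counter(classify(e) for e in errors)
print(corpus)  # counts per categorical error type
```

Frequency counts like these are what the feedback loop would mine to decide which inferred corrective transformations to apply first.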
[076] Further, the system 100 comprises the evaluation framework that examines factors such as accuracy, relevance of responses, ability to recall and precisely apply context, and relevance of provided context.

[077] Pre-processing and Post-processing Evaluation:
a. Faithfulness: Does the LLM's output faithfully represent the information retrieved? In this case, are the actions described in the scenario accurately reflected in the generated OpenSCENARIO XML?
b. Answer Relevancy: Are the outputs of the LLM relevant to the scenario presented? This would involve checking if the LLM's generated scenario details align with the user's description.
c. Context Recall: How well does the retrieval system recall relevant information? In this instance, it would be the retrieval of similar driving scenarios or known templates.
d. Context Precision: Does the system retrieve only the relevant context? This means the retrieved information should be closely related to sandbag avoidance and near-collision scenarios.
e. Context Relevancy: Is the retrieved context relevant to the user's query? The retrieved context should directly inform the driving scenario involving sandbags and near-collisions.
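The context-recall and context-precision metrics above reduce to set-overlap ratios when retrieved and relevant context items are treated as sets. The item labels below are hypothetical; only the metric definitions are taken from the text.

```python
# Context recall = |retrieved ∩ relevant| / |relevant|
# Context precision = |retrieved ∩ relevant| / |retrieved|
def context_recall(retrieved, relevant):
    return len(retrieved & relevant) / len(relevant)

def context_precision(retrieved, relevant):
    return len(retrieved & relevant) / len(retrieved)

relevant = {"sandbag-avoidance", "near-collision", "wet-road"}
retrieved = {"sandbag-avoidance", "near-collision", "parking-lot"}

print(round(context_recall(retrieved, relevant), 2))     # 0.67
print(round(context_precision(retrieved, relevant), 2))  # 0.67
```

Tracking these ratios across system versions is what enables the regression testing described in paragraph [079].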
[078] Scenario Evaluation:
a. Answer Semantic Similarity: How semantically similar is the LLM's output to what an ideal OpenSCENARIO representation of the user's scenario would be?
b. Answer Correctness: Does the LLM produce a correct OpenSCENARIO output? The output should not only be syntactically correct but also contextually accurate, reflecting the near-collision event as described.
[079] Regression Testing:
a. Evaluation Framework: The evaluation framework facilitates regression testing by providing metrics that allow comparison of system performance before and after changes. If a new version of the LLM exhibits lower performance on any of these metrics compared to a previous version, it would indicate a regression.
b. Fine-tuned LLM Evaluation: By systematically applying these metrics to the fine-tuned LLM's output, one can evaluate whether the fine-tuning has improved the LLM's ability to generate accurate, relevant, and coherent outputs in the context of driving scenarios.
[080] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[081] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means, and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[082] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[083] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[084] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[085] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.

CLAIMS:

1. A processor implemented method (400), the method comprising:
acquiring (402), via one or more hardware processors, a plurality of driving scenarios shared in a natural language via a plurality of multimodal inputs associated with a plurality of end user devices, wherein each driving scenario among the plurality of driving scenarios associated with vehicle driving experience is converted to an input text scenario, wherein the input text scenario is an unstructured data format;
preprocessing (404), via the one or more hardware processors, using at least one of domain aware Large Language Models (LLMs), ontologies and Knowledge Graphs, the input text scenario associated with each driving scenario, wherein the preprocessed data provides a structured data format for each driving scenario that contextualizes and aligns for consumption by a pretrained LLM for generating a standard scenario format, wherein preprocessing on the input text scenario comprises:
extracting a plurality of scenario attributes associated with the input text scenario comprising entity attributes, environmental conditions, traffic conditions, driver behavior, scenario dynamics, infrastructure, sensors and communication, and story and act;
generating the structured data format by performing text cleaning, entity recognition, action and event identification, and parameter extraction to obtain quantified parameters from qualitative parameters and weather conditions present in the input text scenario;
obtaining structural transformation of the structured data by adhering to the schema of the standard scenario format;
performing schema validation and logical validation of the structured data format to ensure accuracy of one or more events described in the input text scenario;
performing context and intent recognition to accurately interpret and transform the input text scenario into a formalized driving scenario with technical precision;
identifying factual content within a narrative of the input text scenario and converting into the machine-readable driving scenario, which is a deterministic driving scenario;
enhancing a poorly described scenario in the input text scenario in accordance with the standard scenario format to ensure logical consistency and completeness, and prompting for additional information to ensure descriptions are adequately detailed;
applying in-context prompting for embedding the narrative within the input text scenario and structured representations for guiding the pretrained LLM to accurately interpret and transform the narrative;
applying chain-of-thought prompting to dissect the narratives into logical steps, facilitating extraction and structuring of the driving scenario details into a machine-readable format;
removing personally identifiable information (PII);
identifying similar driving scenarios based on a similarity score from among a plurality of driving scenarios generated for the plurality of driving scenarios to rank in accordance with frequency of occurrence of similar scenarios; and
detecting cultural, linguistic, and contextual biases within the narrative to adjust the narrative to neutralize identified biases;
processing (406), via the one or more hardware processors, using the pretrained LLM the structured data format obtained after preprocessing to generate a machine-readable driving scenario in the standard scenario format for each driving scenario; and
performing post processing and scenario validation (408), via the one or more hardware processors, on the machine-readable driving scenario by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantics discrepancies in context of a logical scenario within the driving scenario to resolve potential errors in the output of the pretrained LLM to finetune the machine-readable driving scenario for each driving scenario, wherein the pretrained LLM is finetuned using reinforcement learning by incorporating feedback received for the generated machine-readable driving scenario.
2. The method as claimed in claim 1, wherein the finetuned machine-readable driving scenario generated for each of the plurality of driving scenarios generates an Autonomous Vehicle (AV) dataset to be consumed for training and simulation of AVs.
3. The method as claimed in claim 1, wherein the post processing and scenario validation comprises correcting the detected syntax and semantics discrepancies, error correction and optimization, and lexical analysis of machine-readable driving scenarios in the standard scenario format.

4. A system comprising:

a memory (102) storing instructions;
one or more Input/Output (I/O) interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more I/O interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
acquire a plurality of driving scenarios shared in a natural language via a plurality of multimodal inputs associated with a plurality of end user devices, wherein each driving scenario among the plurality of driving scenarios associated with vehicle driving experience is converted to an input text scenario, wherein the input text scenario is an unstructured data format;
preprocess using at least one of domain aware Large Language Models (LLMs), ontologies and Knowledge Graphs, the input text scenario associated with each driving scenario, wherein the preprocessed data provides a structured data format for each driving scenario that contextualizes and aligns for consumption by a pretrained LLM for generating a standard scenario format, wherein preprocessing on the input text scenario comprises:
extracting a plurality of scenario attributes associated with the input text scenario comprising entity attributes, environmental conditions, traffic conditions, driver behavior, scenario dynamics, infrastructure, sensors and communication, and story and act;
generating the structured data format by performing text cleaning, entity recognition, action and event identification, and parameter extraction to obtain quantified parameters from qualitative parameters and weather conditions present in the input text scenario;
obtaining structural transformation of the structured data by adhering to the schema of the standard scenario format;
performing schema validation and logical validation of the structured data format to ensure accuracy of one or more events described in the input text scenario;
performing context and intent recognition to accurately interpret and transform the input text scenario into a formalized driving scenario with technical precision;
identifying factual content within a narrative of the input text scenario and converting into the machine-readable driving scenario, which is a deterministic driving scenario;
enhancing a poorly described scenario in the input text scenario in accordance with the standard scenario format to ensure logical consistency and completeness, and prompting for additional information to ensure descriptions are adequately detailed;
applying in-context prompting for embedding the narrative within the input text scenario and structured representations for guiding the pretrained LLM to accurately interpret and transform the narrative;
applying chain-of-thought prompting to dissect the narratives into logical steps, facilitating extraction and structuring of the driving scenario details into a machine-readable format;
removing personally identifiable information (PII);
identifying similar driving scenarios based on a similarity score from among a plurality of driving scenarios generated for the plurality of driving scenarios to rank in accordance with frequency of occurrence of similar scenarios; and
detecting cultural, linguistic, and contextual biases within the narrative to adjust the narrative to neutralize identified biases;
process using the pretrained LLM the structured data format obtained after preprocessing to generate a machine-readable driving scenario in the standard scenario format for each driving scenario; and
perform post processing and scenario validation on the machine-readable driving scenario by performing transformations based on scenario analysis and synthesis techniques for detecting syntax and semantics discrepancies in context of a logical scenario within the driving scenario to resolve potential errors in the output of the pretrained LLM to finetune the machine-readable driving scenario for each driving scenario, wherein the pretrained LLM is finetuned using reinforcement learning by incorporating feedback received for the generated machine-readable driving scenario.
5. The system as claimed in claim 4, wherein the finetuned machine-readable driving scenario generated for each of the plurality of driving scenarios generates an Autonomous Vehicle (AV) dataset to be consumed for training and simulation of AVs.
6. The system as claimed in claim 4, wherein the post processing and scenario validation comprises correcting the detected syntax and semantics discrepancies, error correction and optimization, and lexical analysis of machine-readable driving scenarios in the standard scenario format.
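As an informal illustration only (not part of the claims), two of the preprocessing steps recited above, removal of personally identifiable information and ranking of similar driving scenarios by frequency of occurrence, could be sketched as below. The regular-expression patterns, the similarity threshold, and all function names are assumptions made for illustration; the claims do not prescribe any particular technique.

```python
import re
from difflib import SequenceMatcher

# Illustrative PII patterns (assumed): email addresses and phone-like numbers.
PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
]


def remove_pii(text: str) -> str:
    """Replace matched PII substrings with a redaction marker."""
    for pat in PII_PATTERNS:
        text = pat.sub("[REDACTED]", text)
    return text


def rank_similar_scenarios(scenarios, threshold=0.8):
    """Group scenarios whose similarity score exceeds the threshold and
    rank the groups by frequency of occurrence (most frequent first)."""
    groups = []  # list of (representative scenario, occurrence count)
    for s in scenarios:
        for i, (rep, n) in enumerate(groups):
            if SequenceMatcher(None, s, rep).ratio() >= threshold:
                groups[i] = (rep, n + 1)
                break
        else:
            groups.append((s, 1))
    return sorted(groups, key=lambda g: g[1], reverse=True)
```

In such a pipeline, `remove_pii` would be applied to each input text scenario before LLM consumption, and `rank_similar_scenarios` would surface the most frequently occurring scenario variants for prioritized dataset generation.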

Documents

Application Documents

# Name Date
1 202321031887-STATEMENT OF UNDERTAKING (FORM 3) [04-05-2023(online)].pdf 2023-05-04
2 202321031887-PROVISIONAL SPECIFICATION [04-05-2023(online)].pdf 2023-05-04
3 202321031887-FORM 1 [04-05-2023(online)].pdf 2023-05-04
4 202321031887-DRAWINGS [04-05-2023(online)].pdf 2023-05-04
5 202321031887-DECLARATION OF INVENTORSHIP (FORM 5) [04-05-2023(online)].pdf 2023-05-04
6 202321031887-Proof of Right [05-05-2023(online)].pdf 2023-05-05
7 202321031887-FORM-26 [19-06-2023(online)].pdf 2023-06-19
8 202321031887-FORM 3 [03-05-2024(online)].pdf 2024-05-03
9 202321031887-FORM 18 [03-05-2024(online)].pdf 2024-05-03
10 202321031887-ENDORSEMENT BY INVENTORS [03-05-2024(online)].pdf 2024-05-03
11 202321031887-DRAWING [03-05-2024(online)].pdf 2024-05-03
12 202321031887-COMPLETE SPECIFICATION [03-05-2024(online)].pdf 2024-05-03
13 Abstract.1.jpg 2024-06-19
14 202321031887-FER.pdf 2025-11-11

Search Strategy

1 202321031887_SearchStrategyNew_E_SearchStrategyE_08-10-2025.pdf