
Device And Method Of Object Based Spatial Audio Mastering

Abstract: The invention relates to an apparatus for generating a processed signal using a plurality of audio objects according to an embodiment, each audio object of the plurality of audio objects comprising an audio object signal and audio object metadata, and the audio object metadata comprising a position of the audio object and a gain parameter of the audio object. The apparatus comprises an interface (110) for the user to specify at least one effect parameter of a processing object group of audio objects, the processing object group of audio objects comprising two or more audio objects of the plurality of audio objects. The apparatus also comprises a processor unit (120) which is designed to generate the processed signal such that the at least one effect parameter specified by means of the interface (110) is applied to the audio object signal or to the audio object metadata of each of the audio objects of the processing object group of audio objects. One or more audio objects of the plurality of audio objects do not belong to the processing object group of audio objects.


Patent Information

Application #
Filing Date
13 May 2021
Publication Number
44/2021
Publication Type
INA
Invention Field
ELECTRONICS
Status
Email
IPRDEL@LAKSHMISRI.COM
Parent Application
Patent Number
Legal Status
Grant Date
2023-12-13
Renewal Date

Applicants

FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Hansastraße 27c 80686 München

Inventors

1. HESTERMANN, Simon
c/o Fraunhofer-Institut für Digitale Medien Ehrenbergstr. 31 98693 Ilmenau
2. SLADECZEK, Christoph
c/o Fraunhofer-Institut für Digitale Medien Ehrenbergstr. 31 98693 Ilmenau
3. SEIDENECK, Mario
c/o Fraunhofer-Institut für Digitale Medien Ehrenbergstr. 31 98693 Ilmenau

Specification

The application relates to audio object processing, audio object encoding and audio object decoding and, in particular, audio mastering for audio objects.

Object-based spatial audio is an approach to interactive three-dimensional audio production. This concept changes not only how content creators or authors can interact with audio, but also how it is stored and broadcast. To make this possible, a new process called “rendering” must be established in the reproduction chain. The rendering process generates loudspeaker signals from an object-based description of the scene. Although recording and mixing have been explored in recent years, concepts for object-based mastering are largely missing. The main difference compared to channel-based audio mastering is that, instead of adjusting the audio channels, the audio objects have to be changed. This requires a fundamentally new concept for mastering.

In recent years, the object-based audio approach has generated a lot of interest. In contrast to channel-based audio, in which loudspeaker signals are stored as the result of spatial audio production, the audio scene is described by audio objects. An audio object can be viewed as a virtual sound source consisting of an audio signal with additional metadata, e.g. position and gain. In order to reproduce audio objects, what is known as an audio renderer is required. Audio rendering is the process of generating loudspeaker or headphone signals based on further information, for example the position of the loudspeakers or the position of the listener in the virtual scene.

The process of audio content creation can be broken down into three main parts: recording, mixing, and mastering. While all three steps have been dealt with extensively for channel-based audio over the past few decades, object-based audio will require new workflows in future applications. So far, the recording step generally does not have to be changed, even if future technologies could open up new possibilities [1], [2]. The mixing process is somewhat different, as the sound engineer no longer creates a spatial mix by panning signals to dedicated speakers. Instead, all audio object positions are generated by a spatial authoring tool, which allows the metadata portion of each audio object to be defined.

Traditional audio mixes route multiple audio tracks to a specific number of output channels. This makes it necessary to create individual mixes for different playback configurations, but enables efficient handling of the output channels during mastering [4]. When using the object-based audio approach, the audio renderer is responsible for creating all loudspeaker signals in real time. Arranging a large number of audio objects in a creative mixing process results in complex audio scenes. However, since the renderer can reproduce the audio scene on several different loudspeaker setups, it is not possible to address the output channels directly during production.

To this day, conventional audio production is aimed at very specific playback systems and their channel configuration, for example stereo or surround playback. The decision as to which playback device(s) the content is designed for must therefore be made at the start of its production. The production process itself then consists of recording, mixing and mastering. The mastering process optimizes the final mix to ensure that it is played back in satisfactory quality on all consumer systems with different speaker characteristics. Since the desired output format of a mix is fixed, the mastering engineer (ME) can create an optimized master for this playback configuration.

The mastering phase makes it sensible for creators to produce audio in suboptimal acoustic environments, as they can rely on a final check of their mix during mastering. This lowers the barriers to entry for the production of professional content. On the other hand, the MEs themselves have been offered a wide range of mastering tools over the years, which has drastically improved their possibilities for corrections and improvements. Nonetheless, the final content is usually limited to the playback device for which it was designed.

This limitation is basically overcome by Object-Based Spatial Audio Production (OBAP). In contrast to channel-based audio, OBAP is based on individual audio objects with metadata that includes their position in an artificial environment, also known as a "scene". Only at the final listening output does a dedicated rendering unit, the renderer, calculate the final loudspeaker signals in real time based on the listener's loudspeaker equipment.

Although OBAP provides each audio object and its metadata individually to the renderer, no direct channel-based adjustments are possible during production, and therefore no existing mastering tools for conventional playback devices can be used. Instead, OBAP requires that all final adjustments be made in the mix. The requirement to implement overall sound adjustments by manually handling each individual audio object is not only highly inefficient; it also places high demands on the monitoring equipment of every creator and strictly limits the sound quality of object-based 3D audio content to the acoustic properties of the environment in which it was created.

Ultimately, developing tools that enable a similarly powerful mastering process for OBAP on the creator side could foster the adoption of 3D audio content production by lowering production barriers and opening up new space for sound aesthetics and quality.

While initial thoughts on spatial mastering have been made available to the public [5], this document presents new approaches on how conventional mastering tools can be adapted and what types of new tools can be considered helpful in mastering object-based spatial audio. In [5], for example, a basic procedure is described for how metadata can be used to derive object-specific parameters from global properties. Furthermore, a concept of an area of interest with a surrounding transition area in connection with OBAP applications is described in [6].

It is therefore desirable to provide improved object-based audio mastering concepts.

An apparatus according to claim 1, an encoder according to claim 14, a decoder according to claim 15, a system according to claim 17, a method according to claim 18 and a computer program according to claim 19 are provided.

An apparatus for generating a processed signal using a plurality of audio objects according to an embodiment is provided, wherein each audio object of the plurality of audio objects comprises an audio object signal and audio object metadata, the audio object metadata including a position of the audio object and a gain parameter of the audio object. The device comprises: an interface for specifying at least one effect parameter of a processing object group of audio objects by a user, wherein the processing object group of audio objects comprises two or more audio objects of the plurality of audio objects. Furthermore, the device comprises a processor unit which is designed to generate the processed signal in such a way that the at least one effect parameter, which has been specified by means of the interface is applied to the audio object signal or to the audio object metadata of each of the audio objects of the processing object group of audio objects. One or more audio objects of the plurality of audio objects do not belong to the processing object group of audio objects.

Furthermore, a method for generating a processed signal using a plurality of audio objects is provided, wherein each audio object of the plurality of audio objects comprises an audio object signal and audio object metadata, wherein the audio object metadata comprises a position of the audio object and a gain parameter of the audio object. The procedure includes:

Specification of at least one effect parameter of a processing object group of audio objects by a user by means of an interface (110), wherein the processing object group of audio objects comprises two or more audio objects of the plurality of audio objects. And:

Generating the processed signal by a processor unit (120) such that the at least one effect parameter that has been specified by means of the interface is applied to the audio object signal or to the audio object metadata of each of the audio objects of the processing object group of audio objects.

Furthermore, a computer program with a program code for carrying out the method described above is provided.

The provided audio mastering is based on a mastering of audio objects. In embodiments, these can be positioned anywhere in a scene and moved freely in real time. In embodiments, the properties of audio objects are influenced via general mastering objects. In their function as artificial containers, these mastering objects can each contain an arbitrary number of audio objects. Every adjustment to a mastering object is converted in real time into individual adjustments to its audio objects.

Such mastering objects are also referred to as processing objects.

Thus, instead of adjusting numerous audio objects separately, the user can use a mastering object in order to carry out common adjustments simultaneously on several audio objects.

For example, the set of target audio objects for a mastering object can be defined in a number of ways, according to embodiments. From a spatial perspective, the user can define a custom catchment area around the position of the mastering object. Alternatively, it is possible to link individually selected audio objects to the mastering object regardless of their position. The mastering object also takes into account potential changes in the position of audio objects over time.

A second property of mastering objects according to embodiments can be, for example, their ability to calculate, on the basis of interaction models, how each audio object is individually influenced. Similar to a channel strip, a mastering object can, for example, host any common mastering effect, for example equalizers and compressors. Effect plug-ins usually provide the user with numerous parameters, e.g. for frequency or gain control. When a new mastering effect is added to a mastering object, it is automatically copied to all audio objects in its target set. However, not all effect parameter values are transferred unchanged. Depending on the calculation method for the target set, some parameters of the mastering effect can be weighted before they are applied to a specific audio object. The weighting can be based on any metadata or a sound characteristic of the audio object.
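The weighted transfer of an effect parameter described above can be sketched as follows. This is a minimal illustration only: the class name, the per-object weight table and the purely multiplicative weighting are assumptions made for this example, not the claimed implementation.

```python
from dataclasses import dataclass

@dataclass
class AudioObject:
    name: str
    position: tuple   # (x, y, z) scene coordinates from the metadata
    gain_db: float = 0.0

def apply_weighted_parameter(base_value, targets, weights):
    """Copy one effect parameter of a mastering object to every audio
    object in its target set, scaled by a per-object weight in [0, 1]."""
    return {obj.name: base_value * weights[obj.name] for obj in targets}

objs = [AudioObject("src1", (0, 0, 0)), AudioObject("src2", (1, 0, 0))]
weights = {"src1": 1.0, "src2": 0.5}
print(apply_weighted_parameter(6.0, objs, weights))
# {'src1': 6.0, 'src2': 3.0}
```

The weighting source (here a fixed table) could equally be derived from metadata such as distance or angle, as described further below.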

Preferred embodiments of the invention are described below with reference to the drawings.

The drawings show:

Fig. 1 shows an apparatus for generating a processed signal using a plurality of audio objects according to an embodiment.

Fig. 2 shows a device according to a further embodiment, wherein the device is an encoder.

Fig. 3 shows a device according to a further embodiment, wherein the device is a decoder.

Fig. 4 shows a system according to an embodiment.

Fig. 5 shows a processing object with the area A and the fading area Ar according to one embodiment.

Fig. 6 shows a processing object with the area A and object radii according to an embodiment.

Fig. 7 shows a relative angle of audio objects to the processing object according to an embodiment.

Fig. 8 shows an equalizer object with a new radial circumference according to an embodiment.

Fig. 9 shows a signal flow of a compression of the signals from n sources according to an embodiment.

Fig. 10 shows a scene transformation using a control panel M according to an embodiment.

Fig. 11 shows the relationship of a processing object with which audio signal effects and metadata effects are brought about, according to one embodiment.

Fig. 12 shows the change in audio objects and audio signals in response to an input by a user according to an embodiment.

Fig. 13 shows a processing object PO4 with a rectangle M for distorting the corners C1, C2, C3 and C4 by the user according to one embodiment.

Fig. 14 shows processing objects PO1 and PO2 with their respective overlapping two-dimensional catchment areas A and B according to one embodiment.

Fig. 15 shows processing object PO3 with a rectangular, two-dimensional catchment area C and the angles between PO3 and the associated sources S2 and S3 according to one embodiment.

Fig. 16 shows a possible schematic implementation of an equalizer effect applied to a processing object according to an embodiment.

Fig. 17 shows the processing object PO5 with a three-dimensional catchment area D and the respective distances dS1, dS2 and dS3 to the sources S1, S2 and S3 assigned via the catchment area according to one embodiment.

Fig. 18 shows a prototypical implementation of a processing object to which an equalizer has been applied, according to one embodiment.

Fig. 19 shows a processing object as in Fig. 18, only in a different position and without a transition surface, according to an embodiment.

Fig. 20 shows a processing object with an area defined by its azimuth as the catchment area, so that the sources Src22 and Src4 are assigned to the processing object according to one embodiment.

Fig. 21 shows a processing object as in Fig. 20, but with an additional transition area that can be controlled by the user using the “Feather” slider according to one embodiment.

Fig. 22 shows several processing objects in the scene, with different catchment areas according to one embodiment.

Fig. 23 shows, as the red square on the right-hand side of the picture, a processing object for horizontal distortion of the position of audio objects according to an embodiment.

Fig. 24 shows the scene after the user has warped the corners of the processing object. The position of all sources has changed according to the distortion, according to one embodiment.

Fig. 25 shows a possible visualization of the assignment of individual audio objects to a processing object according to an embodiment.

Fig. 1 shows an apparatus for generating a processed signal using a plurality of audio objects according to an embodiment, wherein each audio object of the plurality of audio objects comprises an audio object signal and audio object metadata, the audio object metadata comprising a position of the audio object and a gain parameter of the audio object.

The device comprises: an interface 110 for specifying at least one effect parameter of a processing object group of audio objects by a user, wherein the processing object group of audio objects comprises two or more audio objects of the plurality of audio objects.

The device further comprises a processor unit 120 which is designed to generate the processed signal in such a way that the at least one effect parameter specified by means of the interface 110 is applied to the audio object signal or to the audio object metadata of each of the audio objects of the processing object group of audio objects.

One or more audio objects of the plurality of audio objects do not belong to the processing object group of audio objects.

The device of FIG. 1 described above realizes an efficient form of audio mastering for audio objects.

In the case of audio objects, the problem arises that a large number of audio objects often exist in an audio scene. If these are to be modified, it would be a considerable effort to specify each audio object individually.

According to the invention, a group of two or more audio objects is now organized in a group of audio objects, which is referred to as a processing object group. A processing object group is thus a dedicated group in which audio objects are organized.

According to the invention, a user now has the option of specifying one or more (at least one) effect parameters by means of the interface 110. The processor unit 120 then ensures that the effect parameter is applied to all two or more audio objects of the processing object group by a single entry of the effect parameter.

Such an application of the effect parameter can, for example, consist in the effect parameter modifying, for example, a specific frequency range of the audio object signal of each of the audio objects of the processing object group.

Or, the gain parameter of the audio object metadata of each of the audio objects of the processing object group can, for example, be increased or decreased accordingly as a function of the effect parameter.

Or, the position of the audio object metadata of each of the audio objects of the processing object group can be changed accordingly, for example as a function of the effect parameters. For example, it is conceivable that all audio objects of the processing object group are shifted by +2 along an x-coordinate axis, by -3 along a y-coordinate axis and by +4 along a z-coordinate axis.
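The uniform position shift described above can be sketched as follows; the function name and the plain-tuple representation of positions are illustrative assumptions, not part of the claimed apparatus.

```python
def shift_positions(positions, offset):
    """Shift every audio object position in the processing object group
    by the same (dx, dy, dz) offset taken from the effect parameter."""
    dx, dy, dz = offset
    return [(x + dx, y + dy, z + dz) for (x, y, z) in positions]

group = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
print(shift_positions(group, (2, -3, 4)))
# [(2.0, -3.0, 4.0), (3.0, -1.0, 7.0)]
```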

It is also conceivable that the application of an effect parameter to the audio objects of the processing object group has different effects for each audio object of the processing object group. For example, an axis on which the position of all audio objects of the processing object group is mirrored can be defined as an effect parameter. The change in position then has a different effect on each audio object in the processing object group.
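Such a mirroring, where a single effect parameter moves every group member differently, might look as follows in a minimal sketch (the choice of a vertical mirror plane x = axis_x and the function name are assumptions for illustration):

```python
def mirror_x(positions, axis_x=0.0):
    """Mirror each (x, y, z) position across the plane x = axis_x; one
    shared effect parameter yields a different shift per object."""
    return [(2 * axis_x - x, y, z) for (x, y, z) in positions]

print(mirror_x([(1.0, 0.0, 0.0), (-2.0, 1.0, 0.0)]))
# [(-1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
```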

In one embodiment, the processor unit 120 can be designed, for example, not to apply the at least one effect parameter that was specified by means of the interface to any audio object signal or any audio object metadata of the one or more audio objects that do not belong to the processing object group of audio objects.

For such an embodiment it is specified that the effect parameter is specifically not applied to audio objects that do not belong to the processing object group.

In principle, audio object mastering can either be carried out centrally on the encoder side. Or, on the decoder side, the end user, as the recipient of the audio object scenery, can modify the audio objects himself according to the invention.

An embodiment that realizes audio object mastering according to the invention on the encoder side is shown in Fig. 2.

An embodiment which realizes audio object mastering according to the invention on the decoder side is shown in Fig. 3.

FIG. 2 shows a device according to a further embodiment, the device being an encoder.

In FIG. 2, the processor unit 120 is designed to generate a downmix signal using the audio object signals of the plurality of audio objects. The processor unit 120 is designed to generate a metadata signal using the audio object metadata of the plurality of audio objects.

Furthermore, the processor unit 120 in FIG. 2 is designed to generate the downmix signal as the processed signal, with at least one modified object signal for each audio object of the processing object group of audio objects being mixed into the downmix signal, the processor unit 120 being designed, for each audio object of the processing object group of audio objects, to generate the modified object signal of this audio object by applying the at least one effect parameter that was specified by means of the interface 110 to the audio object signal of this audio object.

Or, the processor unit 120 of FIG. 2 is designed to generate the metadata signal as the processed signal, the metadata signal comprising at least one modified position for each audio object of the processing object group of audio objects, the processor unit 120 being designed to generate the modified position of this audio object for each audio object of the processing object group of audio objects by means of the application of the at least one effect parameter, which was specified by means of the interface 110, to the position of this audio object.

Or, the processor unit 120 of FIG. 2 is designed to generate the metadata signal as the processed signal, the metadata signal including at least one modified gain parameter for each audio object of the processing object group of audio objects, the processor unit 120 being designed to generate, for each audio object of the processing object group of audio objects, the modified gain parameter of this audio object by applying the at least one effect parameter, which was specified by means of the interface 110, to the gain parameter of this audio object.

Fig. 3 shows a device according to a further embodiment, wherein the device is a decoder. The device of FIG. 3 is designed to receive a downmix signal in which the plurality of audio object signals of the plurality of audio objects are mixed. Furthermore, the device of FIG. 3 is designed to receive a metadata signal, the metadata signal including, for each audio object of the plurality of audio objects, the audio object metadata of this audio object.

The processor unit 120 of FIG. 3 is designed to reconstruct the plurality of audio object signals of the plurality of audio objects based on the downmix signal.

Furthermore, the processor unit 120 of FIG. 3 is designed to generate an audio output signal comprising one or more audio output channels as the processed signal.

Furthermore, the processor unit 120 of FIG. 3 is designed, in order to generate the processed signal, to apply the at least one effect parameter specified by means of the interface 110 to the audio object signal of each of the audio objects of the processing object group of audio objects, or, in order to generate the processed signal, to apply the at least one effect parameter specified by means of the interface 110 to the position or to the gain parameter of the audio object metadata of each of the audio objects of the processing object group of audio objects.

In audio object decoding, rendering on the decoder side is well known to those skilled in the art, for example from the SAOC standard (Spatial Audio Object Coding), see [8].

On the decoder side, for example, one or more rendering parameters can be specified by a user input via the interface 110.

Thus, in one embodiment, the interface 110 of FIG. 3 can, for example, also be designed for the user to specify one or more rendering parameters. The processor unit 120 of FIG. 3 can be designed, for example, to generate the processed signal using the one or more rendering parameters as a function of the position of each audio object of the processing object group of audio objects.

Fig. 4 shows a system according to an embodiment comprising an encoder 200 and a decoder 300.

The encoder 200 of FIG. 4 is designed to generate a downmix signal based on audio object signals of a plurality of audio objects and to generate a metadata signal based on audio object metadata of the plurality of audio objects, the audio object metadata comprising a position of the audio object and a gain parameter of the audio object.

The decoder 300 of FIG. 4 is designed to generate an audio output signal comprising one or more audio output channels based on the downmix signal and based on the metadata signal.

The encoder 200 of the system of FIG. 4 can be a device according to FIG. 2.

Or, the decoder 300 of the system of FIG. 4 may be a device according to FIG. 3.

Or, the encoder 200 of the system of FIG. 4 can be a device according to FIG. 2, and the decoder 300 of the system of FIG. 4 can be a device according to FIG. 3.

The following embodiments can be implemented equally in a device in FIG. 1 and in a device in FIG. 2 and in a device in FIG. 3. They can also be implemented in an encoder 200 of the system of FIG. 4, as well as in a decoder 300 of the system of FIG. 4.

According to one embodiment, the processor unit 120 can be designed, for example, to generate the processed signal in such a way that the at least one effect parameter specified by means of the interface 110 is applied to the audio object signal of each of the audio objects of the processing object group of audio objects. The processor unit 120 can be designed, for example, not to apply the at least one effect parameter that was specified by means of the interface to any audio object signal of the one or more audio objects of the plurality of audio objects that do not belong to the processing object group of audio objects.

Such an application of the effect parameter can for example consist in applying the effect parameter to the audio object signal of each audio object of the processing object group, for example modifying a certain frequency range of the audio object signal of each of the audio objects of the processing object group.

In one embodiment, the processor unit 120 can be designed, for example, to generate the processed signal in such a way that the at least one effect parameter specified by means of the interface 110 is applied to the gain parameter of the metadata of each of the audio objects of the processing object group of audio objects. The processor unit 120 can be designed, for example, not to apply the at least one effect parameter that was specified by means of the interface to any gain parameter of the audio object metadata of the one or more audio objects of the plurality of audio objects that do not belong to the processing object group of audio objects.

As already described, in such an embodiment the gain parameter of the audio object metadata of each of the audio objects of the processing object group can, for example, be increased (e.g. by +3 dB) or reduced accordingly as a function of the effect parameter.
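A gain offset of this kind might be sketched as follows (the function names are illustrative assumptions; the conversion to a linear amplitude factor is shown only for reference):

```python
def adjust_gains_db(gains_db, delta_db):
    """Offset the gain parameter (in dB) of every audio object in the
    processing object group by the same effect parameter delta_db."""
    return [g + delta_db for g in gains_db]

def db_to_linear(db):
    """Convert a dB gain value to a linear amplitude factor."""
    return 10 ** (db / 20.0)

print(adjust_gains_db([0.0, -6.0], 3.0))   # [3.0, -3.0]
print(round(db_to_linear(6.0), 3))         # 1.995
```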

According to one embodiment, the processor unit 120 can be designed, for example, to generate the processed signal in such a way that the at least one effect parameter, which was specified by means of the interface 110, is applied to the position of the metadata of each of the audio objects of the processing object group of audio objects. The processor unit 120 can be designed, for example, not to apply the at least one effect parameter that was specified by means of the interface to any position of the audio object metadata of the one or more audio objects of the plurality of audio objects that do not belong to the processing object group of audio objects.

As already described, in such an embodiment the position of the audio object metadata of each of the audio objects of the processing object group can be changed accordingly, for example as a function of the effect parameter. This can be done, for example, by specifying the x, y and z coordinate values by which the position of each of the audio objects is to be shifted. Or, for example, a shift by a certain angle, rotated around a defined center point (for example around a user position), can be specified. Or, for example, doubling (or halving) the distance to a certain point can be provided as the effect parameter for the position of each audio object in the processing object group.
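The rotation about a defined center point and the distance scaling mentioned above can be sketched in two dimensions as follows (function names and the 2D simplification are assumptions for this example):

```python
import math

def rotate_about(positions, center, angle_deg):
    """Rotate each (x, y) position about `center` by angle_deg; a single
    effect parameter that moves every group member differently."""
    cx, cy = center
    a = math.radians(angle_deg)
    out = []
    for x, y in positions:
        dx, dy = x - cx, y - cy
        out.append((cx + dx * math.cos(a) - dy * math.sin(a),
                    cy + dx * math.sin(a) + dy * math.cos(a)))
    return out

def scale_distance(positions, center, factor):
    """Double (factor=2) or halve (factor=0.5) each object's distance
    to a reference point, e.g. the user position."""
    cx, cy = center
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor)
            for x, y in positions]

p = rotate_about([(1.0, 0.0)], (0.0, 0.0), 90.0)
print([(round(x, 6), round(y, 6)) for x, y in p])      # [(0.0, 1.0)]
print(scale_distance([(1.0, 1.0)], (0.0, 0.0), 2.0))   # [(2.0, 2.0)]
```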

In one embodiment, the interface 110 can, for example, be designed for the specification of at least one definition parameter of the processing object group of audio objects by the user. The processor unit 120 can, for example, be designed to determine, depending on the at least one definition parameter of the processing object group of audio objects that was specified by means of the interface 110, which audio objects of the plurality of audio objects belong to the processing object group of audio objects.

Thus, according to one embodiment, the at least one definition parameter of the processing object group of audio objects can include, for example, at least one position of an area of interest (the position of the area of interest being, for example, the center or focus of the area of interest). The area of interest can be assigned to the processing object group of audio objects. The processor unit 120 can be designed, for example, to determine, for each audio object of the plurality of audio objects, as a function of the position of the audio object metadata of this audio object and as a function of the position of the area of interest, whether this audio object belongs to the processing object group of audio objects.

In one embodiment, the at least one definition parameter of the processing object group of audio objects can, for example, furthermore comprise a radius of the area of interest that is assigned to the processing object group of audio objects. The processor unit 120 can be designed, for example, to decide, for each audio object of the plurality of audio objects, depending on the position of the audio object metadata of this audio object, on the position of the area of interest and on the radius of the area of interest, whether this audio object belongs to the processing object group of audio objects.

For example, a user can specify a position of the processing object group and a radius of the processing object group. The position of the processing object group can specify a spatial center point, and the radius of the processing object group then defines a circle together with the center point of the processing object group. All audio objects with a position within the circle or on the circular line can then be defined as audio objects of this processing object group; all audio objects with a position outside the circle are then not included in the processing object group. The area within the circular line and on the circular line can then be understood as an “area of interest”.
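The radius-based membership test just described amounts to a simple distance comparison; a minimal 2D sketch (the function name is an assumption):

```python
import math

def in_processing_object(obj_pos, po_center, radius):
    """An audio object belongs to the processing object group if its
    position lies within or on the circle (the 'area of interest')."""
    dx = obj_pos[0] - po_center[0]
    dy = obj_pos[1] - po_center[1]
    return math.hypot(dx, dy) <= radius

print(in_processing_object((1.0, 1.0), (0.0, 0.0), 2.0))  # True
print(in_processing_object((3.0, 0.0), (0.0, 0.0), 2.0))  # False
```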

According to one embodiment, the processor unit 120 can be designed, for example, to determine a weighting factor for each of the audio objects of the processing object group of audio objects as a function of a distance between the position of the audio object metadata of this audio object and the position of the area of interest. The processor unit 120 can be designed, for example, to apply, for each of the audio objects of the processing object group of audio objects, the weighting factor of this audio object, together with the at least one effect parameter that was specified by means of the interface 110, to the audio object signal or to the gain parameter of the audio object metadata of this audio object.

In such an embodiment, the influence of the effect parameter on the individual audio objects of the processing object group is individualized for each audio object in that, in addition to the effect parameters, an individual weighting factor is determined for each audio object and applied to it.

In one embodiment, the at least one definition parameter of the processing object group of audio objects can include, for example, at least one angle which specifies a direction from a defined user position in which an area of interest is located that is assigned to the processing object group of audio objects. The processor unit 120 can be designed, for example, to determine for each audio object of the plurality of audio objects, depending on the position of the metadata of this audio object and depending on the angle that specifies the direction from the defined user position in which the area of interest is located, whether this audio object belongs to the processing object group of audio objects.

According to one embodiment, the processor unit 120 can be designed, for example, to determine a weighting factor for each of the audio objects of the processing object group of audio objects, which weighting factor depends on a difference between a first angle and a further angle, the first angle being the angle that specifies the direction from the defined user position in which the area of interest is located, and the further angle depending on the defined user position and on the position of the metadata of this audio object. The processor unit 120 can be designed, for example, to apply, for each of the audio objects of the processing object group of audio objects, the weighting factor of this audio object together with the at least one effect parameter specified by means of the interface 110 to the audio object signal or to the gain parameter of the audio object metadata of that audio object.
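An angle-based weighting of this kind could be sketched as follows (a linear fade over an angular transition range is assumed here; all identifiers are illustrative, not from the specification):

```python
import math

def relative_angle(user_pos, obj_pos, area_angle_deg):
    """Difference between the direction of the area of interest and the
    direction of an audio object, both seen from the defined user position."""
    obj_angle = math.degrees(math.atan2(obj_pos[1] - user_pos[1],
                                        obj_pos[0] - user_pos[0]))
    diff = abs(obj_angle - area_angle_deg) % 360.0
    return min(diff, 360.0 - diff)  # wrap to [0, 180] degrees

def angle_weight(diff_deg, full_deg, fade_deg):
    """Weighting factor: 1 inside the full angular area, linear fade across
    the transition range, 0 outside."""
    if diff_deg <= full_deg:
        return 1.0
    if fade_deg > 0 and diff_deg <= full_deg + fade_deg:
        return 1.0 - (diff_deg - full_deg) / fade_deg
    return 0.0

# Object at 45 degrees, area of interest at 30 degrees from the user position:
d = relative_angle((0.0, 0.0), (1.0, 1.0), 30.0)
print(d)                             # 15.0
print(angle_weight(d, 10.0, 20.0))   # 0.75
```

The weighting factor would then be applied to the effect parameter of the audio object exactly as in the distance-based case.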

In one embodiment, the processing object group of audio objects can be, for example, a first processing object group of audio objects, wherein, for example, one or more further processing object groups of audio objects can also exist.

Each processing object group of the one or more further processing object groups of audio objects can include one or more audio objects of the plurality of audio objects, with at least one audio object of a processing object group of the one or more further processing object groups of audio objects not being an audio object of the first processing object group of audio objects.

Here, the interface 110 can be designed for the user to specify, for each processing object group of the one or more further processing object groups of audio objects, at least one further effect parameter for this processing object group of audio objects.

The processor unit 120 can be designed to generate the processed signal in such a way that, for each processing object group of the one or more further processing object groups of audio objects, the at least one further effect parameter of this processing object group that was specified by means of the interface 110 is applied to the audio object signal or to the audio object metadata of each of the one or more audio objects of this processing object group, one or more audio objects of the plurality of audio objects not belonging to this processing object group.

Here, the processor unit 120 can be designed, for example, not to apply the at least one further effect parameter of this processing object group, which was specified by means of the interface, to any audio object signal or any audio object metadata of the one or more audio objects that do not belong to this processing object group.

In such embodiments there can therefore be more than one processing object group. One or more separate effect parameters are determined for each of the processing object groups.

According to one embodiment, the interface 110 can be configured, in addition to the first processing object group of audio objects, for the user to specify the one or more further processing object groups of one or more audio objects, in that the interface 110 is designed, for each processing object group of the one or more further processing object groups of one or more audio objects, for the user to specify at least one definition parameter of this processing object group.

The processor unit 120 can be designed, for example, to determine, for each processing object group of the one or more further processing object groups of one or more audio objects, which audio objects of the plurality of audio objects belong to this processing object group, depending on the at least one definition parameter of this processing object group that was specified by means of the interface 110.

Concepts of embodiments of the invention and preferred embodiments are presented below.

In embodiments, any types of global adaptations are made possible in object-based audio production (OBAP) by converting global adaptations into individual changes to the audio objects concerned (e.g. by the processor unit 120).

Spatial mastering for object-based audio production can be implemented as follows, for example, by implementing processing objects according to the invention.

The proposed implementation of overall adjustments is realized using processing objects (POs). Just like conventional audio objects, these can be positioned anywhere in a scene and moved freely in real time. The user can apply any signal processing to the processing object (i.e. to the processing object group), for example equalization (EQ) or compression. For each of these processing tools, the parameter settings of the processing object can be converted into object-specific settings. Various methods for this calculation are presented below.

An area of interest is considered below.

Fig. 5 shows a processing object with the area A and the fading area A_f according to one embodiment.

As shown in Fig. 5, the user defines an area A and a fading area A_f around the processing object. The processing parameters of the processing object are divided into constant parameters and weighted parameters. Values of constant parameters are inherited unchanged by all audio objects within A and A_f. Weighted parameter values are inherited unchanged only by audio objects within A; audio objects within A_f are weighted with a distance factor. The decision as to which parameters are weighted and which are not depends on the parameter type.

Given the user-defined value p_PO(t) of such a weighted parameter for the processing object, the parameter function p_i is defined for each audio object S_i as follows:

p_i(t) = f_i(t) · p_PO(t) (1)

where the factor f_i is given as follows:

f_i(t) = 1, if d_i(t) ≤ r_A
f_i(t) = 1 − (d_i(t) − r_A) / r_Af, if r_A < d_i(t) ≤ r_A + r_Af (2)
f_i(t) = 0, otherwise

where d_i(t) denotes the distance of audio object S_i from the processing object, r_A the radius of the area A and r_Af the width of the fading area A_f. Consequently, if the user specifies r_A = 0, there is no validity range within which weighted parameters are kept constant.
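The distance-dependent weighting of a parameter described above can be sketched in Python as follows (a linear fade across the fading area is assumed; all names are illustrative):

```python
def weight_factor(distance, r_a, r_af):
    """Distance factor f_i for a weighted parameter: full weight inside the
    area A (radius r_a), linear fade across the fading area A_f (width r_af),
    zero outside. The linear fade curve is an assumption."""
    if distance <= r_a:
        return 1.0
    if r_af > 0 and distance <= r_a + r_af:
        return 1.0 - (distance - r_a) / r_af
    return 0.0

def weighted_parameter(p_po, distance, r_a, r_af):
    """Individual parameter value p_i = f_i * p_PO for one audio object."""
    return weight_factor(distance, r_a, r_af) * p_po

print(weighted_parameter(6.0, 1.0, 2.0, 4.0))  # 6.0 (inside A)
print(weighted_parameter(6.0, 4.0, 2.0, 4.0))  # 3.0 (halfway through A_f)
print(weighted_parameter(6.0, 7.0, 2.0, 4.0))  # 0.0 (outside)
```

Constant parameters would bypass `weight_factor` entirely and be inherited unchanged.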

A calculation of inverse parameters according to an embodiment is described below.

Fig. 6 shows a processing object with the area A and object radii according to an embodiment.

User adjustments to the processing object, which are converted using equation (1), may not always lead to the desired results quickly enough, since the exact position of audio objects is not taken into account. For example, if the area around the processing object is very large and the audio objects contained in it are far away from the processing object position, the effect of calculated adjustments may not even be audible at the processing object position.

Another calculation method for gain parameters is based on the sound level that each audio object produces at the position of the processing object. Again, a user-defined area of interest is considered, which is shown in Fig. 6; the individual gain parameter p_i for each audio object is then calculated as follows:

p_i(t) = h_i(t) (3)

where h_i could be defined as follows:

h_i(t) = sgn(g_e(t)) · ( |g_e(t)| + | 20 · log10( a_i / d_i(t) ) | ) (4)

Here, g_e(t) is the user-defined gain of the processing object, a_i is a constant for the closest possible distance to an audio object, and d_i(t) is the distance from the audio object to the EQ object. Derived from the distance law, the function was modified to correctly handle possible positive or negative EQ gain changes.
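A distance-compensated gain of this kind could be sketched as follows (the 20·log10 distance law is assumed; identifiers are illustrative, not from the specification):

```python
import math

def compensated_gain(g_e_db, d_i, a_i=1.0):
    """Gain h_i (in dB) applied to one audio object so that a user-defined
    EQ gain g_e_db remains audible at the processing-object position: the
    distance-law attenuation 20*log10(a_i/d_i) is added to the gain
    magnitude while keeping the sign of g_e_db."""
    if g_e_db == 0.0:
        return 0.0
    sign = 1.0 if g_e_db > 0 else -1.0
    return sign * (abs(g_e_db) + abs(20.0 * math.log10(a_i / d_i)))

print(compensated_gain(3.0, 1.0))  # 3.0 at the reference distance a_i
print(compensated_gain(3.0, 2.0))  # about 9.02: 6 dB added per doubling
```

Keeping the sign separate from the magnitude is what allows both positive and negative EQ gain changes to be compensated symmetrically.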

In the following modified embodiment, an angle-based calculation takes place.

The previous calculations are based on the distance between audio objects and the processing object. However, from a user's perspective, the angle between the processing object and the surrounding audio objects may occasionally represent their auditory impression more accurately. [5] suggests global control of any audio plug-in parameter via the azimuth of audio objects. This approach can be adopted by calculating the difference with respect to the angle α_i between the processing object with offset angle α_PO and the audio objects S_i in its vicinity, as shown in FIG. 7.

Thus, FIG. 7 shows a relative angle of audio objects to the processing object according to one embodiment.

The user-defined area of interest mentioned above could accordingly be changed using the angles α_A and α_Af, which is shown in FIG. 8.

Fig. 8 shows an equalizer object with a new radial circumference according to an embodiment.

With regard to the fading area A_f, the factor f_i would have to be redefined as follows:

f_i(t) = 1, if |α_i(t)| ≤ α_A
f_i(t) = 1 − (|α_i(t)| − α_A) / α_Af, if α_A < |α_i(t)| ≤ α_A + α_Af (5)
f_i(t) = 0, otherwise

Although for the modified approach presented above the distance d_i could in this context simply be interpreted as the angle between the audio object and the EQ object, this would no longer justify applying the distance law. Therefore, only the user-defined area is changed, while the gain calculation is kept as before.

In one embodiment, equalization is implemented as the application.

Equalization can be considered the most important tool in mastering, as the frequency response of a mix is the most critical factor for a good translation across playback systems.

The proposed implementation of an equalization is realized via EQ objects. Since all other parameters are not distance-dependent, only the gain parameter is of particular interest.

In a further embodiment, dynamic control is implemented as the application.

Traditional mastering uses dynamic compression to control dynamic variations in a mix over time. Depending on the compression settings, this changes the perceived density and the transient response of a mix. In the case of slight compression, the perceived change in density is referred to as “glue”, while stronger compression settings can be used for pump or side-chain effects in so-called beat-heavy mixes.

With OBAP, the user could easily define identical compression settings for several neighboring objects in order to implement multi-channel compression. However, summed compression on groups of audio objects would not only be advantageous for time-critical work processes, it would also be more likely to achieve the psychoacoustic impression of so-called “glued” signals.

Fig. 9 shows a signal flow of a compression of the signals from n sources according to an embodiment.

According to a further embodiment, scene transformation is implemented as the application.

In stereo mastering, mid/side processing is a commonly used technique to expand or stabilize the stereo image of a mix. A similar option can be helpful for spatial audio mixes if the mix was created in an acoustically critical environment with potentially asymmetrical room or loudspeaker properties. New creative opportunities could also be provided for the mastering engineer to improve the effects of a mix.

Fig. 10 shows a scene transformation using a control panel M according to an embodiment. Specifically, Fig. 10 shows a schematic implementation using a distortion area with user-draggable corners C_1 to C_4.

A two-dimensional transformation of a scene in the horizontal plane can be implemented using a homography transformation matrix H, which maps each audio object at position p to a new position p', see also [7]:

p' = H · p (6)

with the positions in homogeneous coordinates, p = (x, y, 1)^T, and

H = ( h_11 h_12 h_13 ; h_21 h_22 h_23 ; h_31 h_32 h_33 ) (7)

When the user distorts a control panel M to M' using the four draggable corners (see Fig. 10), their 2D coordinates can be used in a linear system of equations to obtain the coefficients of H in (7).

Since audio object positions can vary over time, the coordinate positions can be interpreted as time-dependent functions.
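The estimation of H from the four corner correspondences and its application to audio-object positions can be sketched as follows (a standard four-point homography solve with h_33 fixed to 1 is assumed; all names are illustrative):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def homography(src, dst):
    """3x3 matrix H mapping four source corners to four distorted corners
    (h_33 fixed to 1), obtained from the resulting 8x8 linear system."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(A, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def transform(H, p):
    """Map an audio-object position p = (x, y) to p' via H in homogeneous
    coordinates, dividing out the projective scale w."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Distort the unit square M to M' by dragging the corner (1, 1) to (1.5, 1.2):
M0 = [(0, 0), (1, 0), (1, 1), (0, 1)]
M1 = [(0, 0), (1, 0), (1.5, 1.2), (0, 1)]
H = homography(M0, M1)
print(transform(H, (1, 1)))  # the dragged corner maps to (1.5, 1.2)
```

Because audio-object positions can vary over time, `transform` would be applied per frame to the time-dependent positions.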

Dynamic equalizers are implemented in embodiments. Other embodiments implement multi-band compression.

Object-based sound adjustments are not limited to the introduced equalizer applications.

The above description is supplemented again below by a more general description of exemplary embodiments.

Object-based three-dimensional audio production follows the approach that audio scenes are calculated and played back in real time for as many loudspeaker configurations as possible using a rendering process. Audio scenes describe the arrangement of audio objects as a function of time. Audio objects consist of audio signals and metadata. This metadata includes the position in space and the volume. Previously, in order to edit a scene, the user had to change all audio objects of the scene individually.

If, on the one hand, a processing object group and, on the other hand, a processing object are mentioned below, it should be noted that a processing object group is always defined for each processing object, which includes audio objects. The processing object group is also referred to, for example, as the container of the processing object. A group of audio objects from the plurality of audio objects is therefore defined for each processing object. The corresponding processing object group comprises the group of audio objects specified in this way. A processing object group is therefore a group of audio objects.

Processing objects can be defined as objects that can change the properties of other audio objects. Processing objects are artificial containers to which any audio objects can be assigned, i.e. all of their assigned audio objects are addressed via the container. The assigned audio objects can be influenced by any number of effects. Processing objects thus offer the user the option of editing multiple audio objects simultaneously.

A processing object has, for example, position, assignment method, container, weighting method, audio signal processing effects and metadata effects.

The position is a position of the processing object in a virtual scene.

The assignment method assigns audio objects to the processing object (possibly using their position).

The container (or connections) is the set of all audio objects assigned to the processing object (or possibly additional other processing objects).

Weighting methods are the algorithms for calculating the individual effect parameter values ​​for the assigned audio objects.

Audio signal processing effects change the audio component of audio objects (e.g. equalizers, dynamics).

Metadata effects change the metadata of audio objects and/or processing objects (e.g. position distortion).

Likewise, the above-described position, the assignment method, the container, weighting method, audio signal processing effects and metadata effects can be assigned to the processing object group. The audio objects of the container of the processing object are the audio objects of the processing object group.
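The properties listed above can be sketched as a data structure (the class and field names are illustrative only and do not appear in the specification):

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class AudioObject:
    """Audio object: the audio signal is omitted; only metadata is shown."""
    position: Tuple[float, float]
    gain: float = 0.0

@dataclass
class ProcessingObject:
    """Sketch of a processing object with the properties listed above."""
    position: Tuple[float, float]                  # position in the virtual scene
    assignment: Callable[[AudioObject], bool]      # assignment method
    container: List[AudioObject] = field(default_factory=list)
    weighting: Callable[[float], float] = lambda d: 1.0  # weighting method
    signal_effects: List[Callable] = field(default_factory=list)
    metadata_effects: List[Callable] = field(default_factory=list)

# Assign all audio objects within 2 units of the x-origin to the container:
po = ProcessingObject(position=(0.0, 0.0),
                      assignment=lambda obj: abs(obj.position[0]) < 2.0)
scene = [AudioObject((1.0, 0.0)), AudioObject((5.0, 0.0))]
po.container = [o for o in scene if po.assignment(o)]
print(len(po.container))  # 1
```

The container here is simply the set of audio objects for which the assignment method holds, matching the definition given above.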

Fig. 11 shows the relationship of a processing object to the audio signal effects and metadata effects brought about with it, according to one embodiment.

The following describes the properties of processing objects in accordance with special embodiments:

Processing objects can be placed anywhere in a scene by the user, the position can be set constant over time or time-dependent.

Processing objects can be assigned effects by the user that change the audio signal and / or the metadata of audio objects. Examples of effects are equalization of the audio signal, processing of the dynamics of the audio signal, or changing the position coordinates of audio objects.

Processing objects can be assigned any number of effects in any order.

Effects change the audio signal and / or the metadata of the assigned set of audio objects, in each case constant over time or as a function of time.

Effects have parameters that control signal and/or metadata processing. These parameters are divided into constant and weighted parameters by the user or specified depending on the type.

The effects of a processing object are copied and applied to its associated audio objects. The values of constant parameters are adopted unchanged by each audio object. The values of weighted parameters are calculated individually for each audio object according to various weighting methods. The user can select a weighting method for each effect or activate or deactivate it for individual audio sources.

The weighting methods take into account individual metadata and/or signal characteristics of individual audio objects. This corresponds, for example, to the distance between an audio object and the processing object or to the frequency spectrum of an audio object. The weighting method can also take into account the listening position of the listener. Furthermore, the mentioned properties of audio objects can be combined with one another in the weighting method in order to derive individual parameter values. For example, the sound levels of audio objects can be added as part of dynamic processing in order to derive a change in volume individually for each audio object.

Effect parameters can be set to be constant over time or time-dependent. The weighting method takes such changes over time into account.

Weighting methods can also process information which the audio renderer analyzes from the scene.

The order in which effects are assigned to the processing object corresponds to the sequence in which signals and/or metadata are processed for each audio object, i.e. the data changed by a previous effect are used by the next effect as the basis for its calculation. The first effect works on the unchanged data of an audio object.

Individual effects can be deactivated. Then the calculated data of the previous effect, if one exists, is forwarded to the effect after the deactivated effect.
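The ordered effect chain with deactivatable effects can be sketched as follows (illustrative names; each effect is modeled as a function of the data it transforms):

```python
def apply_effect_chain(effects, data):
    """Apply a processing object's effects in their assigned order: each
    effect works on the output of the previous one, and deactivated effects
    are skipped so that their successor receives the predecessor's output."""
    for effect, active in effects:
        if active:
            data = effect(data)
    return data

chain = [(lambda g: g + 3.0, True),   # EQ gain boost, active
         (lambda g: g * 0.5, False),  # deactivated stage: skipped entirely
         (lambda g: g - 1.0, True)]   # final trim, active
print(apply_effect_chain(chain, 0.0))  # 2.0
```

Skipping the deactivated entry forwards the output of the first effect directly to the third, as described above.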

An explicitly newly developed effect is the change in the position of audio objects by means of homography (“distortion effect”). The user is shown a rectangle with individually movable corners at the position of the processing object. If the user moves a corner, a transformation matrix for this distortion is calculated from the previous state of the rectangle and the newly distorted state. The matrix is then applied to all position coordinates of the audio objects assigned to the processing object, so that their positions change according to the distortion.

Effects that only change metadata can also be applied to other processing objects (including "distortion effects").

The assignment of audio sources to the processing objects can be done in different ways. The amount of assigned audio objects can also change over time, depending on the type of assignment. This change is taken into account in all calculations.

A catchment area can be defined around the position of processing objects. All audio objects that are positioned within the catchment area form the assigned set of audio objects to which the effects of the processing object are applied.

The catchment area can be any body (three-dimensional) or any shape (two-dimensional) that is defined by the user.

The center of the catchment area can, but does not have to, correspond to the position of the object to be processed. The user makes this determination.

An audio object lies within a three-dimensional catchment area if its position lies within the three-dimensional body.

An audio object lies within a two-dimensional catchment area if its position projected onto the horizontal plane lies within the two-dimensional form.
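A two-dimensional containment test with projection onto the horizontal plane can be sketched as follows (a ray-casting point-in-polygon test is assumed; names are illustrative):

```python
def in_2d_catchment(position, polygon):
    """True if the audio object's position, projected onto the horizontal
    plane, lies within the two-dimensional shape (ray-casting test; the
    polygon is given as a list of (x, y) vertices)."""
    x, y = position[0], position[1]  # a z coordinate, if present, is ignored
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count crossings of a horizontal ray from (x, y) to the right:
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(in_2d_catchment((2.0, 2.0, 1.5), square))  # True: z is projected away
print(in_2d_catchment((5.0, 2.0), square))       # False
```

A three-dimensional catchment body would instead test the full position against the body, e.g. distance against a sphere radius.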

The catchment area can assume an unspecified all-encompassing size, so that all audio objects of a scene are in the catchment area.

If necessary, the catchment areas adapt to changes in the scene properties (e.g. scene scaling).

Regardless of the catchment area, processing objects can be linked to any selection of audio objects in a scene.

The coupling can be defined by the user so that all selected audio objects form a set of audio objects to which the effects of the processing object are applied.

The coupling can alternatively be defined by the user in such a way that the processing object adapts its position as a function of time according to the position of the selected audio objects. This adjustment of the position can take into account the listening position of the listener. The effects of the processing object do not necessarily have to be applied to the coupled audio objects.

The assignment can take place automatically based on criteria defined by the user. In this case, all audio objects in a scene are continuously examined for the defined criterion or criteria and, if the criterion or criteria are met, assigned to the processing object. The duration of the assignment can be limited to the time the criteria are met, or transition periods can be defined. The transition periods determine how long one or more criteria must be continuously fulfilled by an audio object so that it is assigned to the processing object, or how long one or more criteria must be continuously violated so that the assignment to the processing object is canceled again.
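The transition-period behavior described above amounts to a hysteresis on the criterion, which could be sketched as follows (frame-based evaluation and all names are assumptions):

```python
def assignment_over_time(criterion_met, join_after, leave_after):
    """Track per frame whether an audio object is assigned to the processing
    object: the criterion must hold for join_after consecutive frames before
    the object is assigned, and must be violated for leave_after consecutive
    frames before the assignment is canceled again."""
    assigned, streak, out = False, 0, []
    for met in criterion_met:
        streak = streak + 1 if met != assigned else 0
        if not assigned and met and streak >= join_after:
            assigned, streak = True, 0
        elif assigned and not met and streak >= leave_after:
            assigned, streak = False, 0
        out.append(assigned)
    return out

frames = [False, True, True, True, False, False, False]
print(assignment_over_time(frames, join_after=2, leave_after=3))
# [False, False, True, True, True, True, False]
```

With `join_after=1` and `leave_after=1` the assignment would follow the criterion immediately, i.e. last exactly as long as the criterion is met.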

Processing objects can be deactivated by the user so that their properties are retained and are still displayed to the user, but the processing object does not influence audio objects.

Any number of properties of a processing object can be linked by the user to any number of other processing objects of the same type. These properties include parameters of effects. The coupling can be selected as absolute or relative by the user. With absolute coupling, the changed property value of a processing object is adopted exactly by all coupled processing objects. With relative coupling, the value of the change is offset against the property values of coupled processing objects.
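The two coupling modes can be sketched as follows (function and parameter names are illustrative):

```python
def propagate(new_value, delta, coupled_values, mode):
    """Propagate a changed effect-parameter value to coupled processing
    objects: with absolute coupling the new value is adopted exactly; with
    relative coupling the delta of the change is offset against each
    coupled object's own value."""
    if mode == "absolute":
        return [new_value for _ in coupled_values]
    return [v + delta for v in coupled_values]

# A gain parameter changed from 2.0 to 5.0 (delta +3.0), two coupled objects:
print(propagate(5.0, 3.0, [1.0, 4.0], "absolute"))  # [5.0, 5.0]
print(propagate(5.0, 3.0, [1.0, 4.0], "relative"))  # [4.0, 7.0]
```

Relative coupling thus preserves offsets between coupled processing objects, while absolute coupling synchronizes them to one value.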

Processing objects can be duplicated. A second processing object with identical properties of the original processing objects is generated. The properties of the processing objects are then independent of one another.

Properties of processing objects can, for example, be permanently inherited when copying, so that changes made by parents are automatically adopted by children.

FIG. 12 shows the change in audio objects and audio signals in response to an input by a user according to an embodiment.

Another new application of processing objects is the intelligent calculation of parameters by means of a scene analysis. The user defines effect parameters at a certain position via the processing object. The audio renderer performs a predictive scene analysis to detect which audio sources have an influence at the position of the processing object, so that the effect desired by the user at the position of the processing object can be achieved.

Further exemplary embodiments of the invention, which are visually represented by means of FIGS. 13-25, are described below.

Fig. 13 shows processing object PO_4 with rectangle M for distortion of the corners C_1, C_2, C_3 and C_4 by the user. Fig. 13 schematically shows a possible distortion towards M' with the corners C_1', C_2', C_3' and C_4', as well as the corresponding effect on the sources S_1, S_2, S_3 and S_4 with their new positions S_1', S_2', S_3' and S_4'.

Fig. 14 shows processing objects PO_1 and PO_2 with their respective overlapping two-dimensional catchment areas A and B, and the distances a_S1, a_S2 and a_S3 or b_S3, b_S4 and b_S6 from the respective processing object to the sources S_1, S_2, S_3, S_4 and S_6 assigned via the catchment areas.

Fig. 15 shows processing object PO_3 with a rectangular, two-dimensional catchment area C and the angles between PO_3 and the assigned sources S_1, S_2 and S_3 for a possible weighting of parameters that includes the listening position of the listener. The angles can be determined from the difference between the azimuth of the individual sources and the azimuth α_PO of PO_3.

Fig. 16 shows a possible schematic implementation of an equalizer effect which has been applied to a processing object. The weighting for the respective parameter can be activated using buttons such as w next to each parameter. For the weighted parameters mentioned, m_1, m_2 and m_3 offer options for the weighting method.

Fig. 17 shows the processing object PO_5 with a three-dimensional catchment area D and the respective distances d_S1, d_S2 and d_S3 to the sources S_1, S_2 and S_3 assigned via the catchment area.

Fig. 18 shows a prototypical implementation of a processing object to which an equalizer has been applied. The turquoise object with the wave symbol on the right-hand side of the image shows the processing object in the audio scene, which the user can move freely with the mouse. Within the turquoise, transparent homogeneous area around the processing object, the equalizer parameters are applied unchanged to the audio objects Src1, Src2 and Src3, as defined on the left side of the screen.

The circular area shows the transparent shading of the area in which all parameters except for the gain parameters are taken over unchanged by the sources. The gain parameters of the equalizer, on the other hand, are weighted depending on the distance between the sources and the processing object. Since only source Src4 and source Src24 are in this area, in this case a weighting only takes place for their parameters. Source Src22 is not influenced by the processing object. The user controls the size of the radius of the circular area around the processing object using the “Area” slider, and the size of the radius of the surrounding transition area using the “Feather” slider.

FIG. 19 shows a processing object as in FIG. 18, only at a different position and without a transition surface. All parameters of the equalizer are transferred unchanged to the sources Src22 and Src4. The sources Src3, Src2, Src1 and Src24 are not influenced by the processing object.

Fig. 20 shows a processing object with an area defined via its azimuth as the catchment area, so that the sources Src22 and Src4 are assigned to the processing object. The tip of the catchment area in the middle of the right-hand side of the picture corresponds to the position of the listener/user. When the processing object is moved, the area is moved along with it in accordance with the azimuth. The user determines the size of the angle of the catchment area with the “Area” slider. The user can change from a circular to an angle-based catchment area via the lower selection field above the “Area”/“Feather” sliders, which now indicates “radius”.

FIG. 21 shows a processing object as in FIG. 20, but with an additional transition area that can be controlled by the user using the “Feather” slider.

Fig. 22 shows several processing objects in the scene, with different catchment areas. The gray processing objects have been deactivated by the user, i.e. they do not affect the audio objects in their catchment area. The equalizer parameters of the currently selected processing object are always displayed on the left-hand side of the screen. The selection is indicated by a thin, light turquoise line around the object.

Fig. 23 shows, as the red square on the right side of the picture, a processing object for horizontal distortion of the positions of audio objects. The user can drag the corners in any direction with the mouse to distort the scene.

Fig. 24 shows the scene after the user has dragged the corners of the processing object. The position of all sources has changed according to the distortion.

Fig. 25 shows a possible visualization of the assignment of individual audio objects to a processing object.

Although some aspects have been described in connection with a device, it goes without saying that these aspects also represent a description of the corresponding method, so that a block or a component of a device is also to be understood as a corresponding method step or as a feature of a method step. Analogously, aspects that have been described in connection with or as a method step also represent a description of a corresponding block or detail or feature of a corresponding device. Some or all of the method steps can be carried out by a hardware apparatus (or using a hardware apparatus), such as a microprocessor, a programmable computer or an electronic circuit.

Depending on specific implementation requirements, exemplary embodiments of the invention can be implemented in hardware or in software, or at least partially in hardware or at least partially in software. The implementation can be carried out using a digital storage medium, for example a floppy disk, a DVD, a BluRay disc, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, a hard disk or another magnetic or optical memory, on which electronically readable control signals are stored which can interact with a programmable computer system in such a way that the respective method is carried out. The digital storage medium can therefore be computer-readable.

Some exemplary embodiments according to the invention thus comprise a data carrier which has electronically readable control signals which are able to interact with a programmable computer system in such a way that one of the methods described herein is carried out.

In general, exemplary embodiments of the present invention can be implemented as a computer program product with a program code, the program code being effective to perform one of the methods when the computer program product runs on a computer.

The program code can, for example, also be stored on a machine-readable carrier.

Other exemplary embodiments include the computer program for performing one of the methods described herein, the computer program being stored on a machine-readable carrier. In other words, an exemplary embodiment of the method according to the invention is thus a computer program which has a program code for performing one of the methods described herein when the computer program runs on a computer.

A further exemplary embodiment of the method according to the invention is thus a data carrier (or a digital storage medium or a computer-readable medium) on which the computer program for performing one of the methods described herein is recorded. The data carrier, the digital storage medium or the computer-readable medium is typically tangible and/or non-volatile.

A further exemplary embodiment of the method according to the invention is thus a data stream or a sequence of signals which represents the computer program for performing one of the methods described herein. The data stream or the sequence of signals can, for example, be configured to be transferred via a data communication connection, for example via the Internet.

Another exemplary embodiment comprises a processing device, for example a computer or a programmable logic component, which is configured or adapted to carry out one of the methods described herein.

Another exemplary embodiment comprises a computer on which the computer program for performing one of the methods described herein is installed.

A further exemplary embodiment according to the invention comprises a device or a system which is designed to transmit a computer program to a receiver for carrying out at least one of the methods described herein. The transmission can take place electronically or optically, for example. The receiver can be, for example, a computer, a mobile device, a storage device or a similar device. The device or the system can comprise, for example, a file server for transmitting the computer program to the recipient.

In some exemplary embodiments, a programmable logic component (for example a field-programmable gate array, an FPGA) can be used to carry out some or all of the functionalities of the methods described herein. In some exemplary embodiments, a field-programmable gate array can interact with a microprocessor in order to carry out one of the methods described herein. In general, in some exemplary embodiments, the methods are carried out by any desired hardware device. This can be universal hardware such as a computer processor (CPU) or hardware specific to the method, such as an ASIC.

The above-described embodiments are merely illustrative of the principles of the present invention. It is to be understood that modifications and variations of the arrangements and details described herein will be apparent to others skilled in the art. It is therefore intended that the invention be limited solely by the scope of protection of the following patent claims and not by the specific details presented herein on the basis of the description and the explanation of the exemplary embodiments.

References

[1] Coleman, P., Franck, A., Francombe, J., Liu, Q., de Campos, T., Hughes, R., Menzies, D., Simón Gálvez, M., Tang, Y., Woodcock, J., Jackson, P., Melchior, F., Pike, C., Fazi, F., Cox, T., and Hilton, A., "An Audio-Visual System for Object-Based Audio: From Recording to Listening," IEEE Transactions on Multimedia, PP(99), pp. 1-1, 2018, ISSN 1520-9210, doi: 10.1109/TMM.2018.2794780.

[2] Gasull Ruiz, A., Sladeczek, C., and Sporer, T., "A Description of an Object-Based Audio Workflow for Media Productions," in Audio Engineering Society Conference: 57th International Conference: The Future of Audio Entertainment Technology - Cinema, Television and the Internet, 2015.

[3] Melchior, F., Michaelis, U., and Steffens, R., "Spatial Mastering - a new concept for spatial sound design in object-based audio scenes," in Proceedings of the International Computer Music Conference 2011, 2011.

[4] Katz, B. and Katz, R. A., Mastering Audio: The Art and the Science, Butterworth-Heinemann, Newton, MA, USA, 2003, ISBN 0240805453.

[5] Melchior, F., Michaelis, U., and Steffens, R., "Spatial Mastering - A New Concept for Spatial Sound Design in Object-based Audio Scenes," Proceedings of the International Computer Music Conference 2011, University of Huddersfield, UK, 2011.

[6] Sladeczek, C., Neidhardt, A., Böhme, M., Seeber, M., and Ruiz, A. G., "An Approach for Fast and Intuitive Monitoring of Microphone Signals Using a Virtual Listener," Proceedings, International Conference on Spatial Audio (ICSA), February 21-23, 2014, Erlangen, 2014.

[7] Dubrofsky, E., Homography Estimation, Master's thesis, University of British Columbia, 2009.

[8] ISO/IEC 23003-2:2010, Information technology - MPEG audio technologies - Part 2: Spatial Audio Object Coding (SAOC), 2010.

Claims

1. An apparatus for generating a processed signal using a plurality of audio objects, each audio object of the plurality of audio objects including an audio object signal and audio object metadata, the audio object metadata including a position of the audio object and a gain parameter of the audio object, the apparatus comprising:

an interface (110) for specifying at least one effect parameter of a processing object group of audio objects by a user, wherein the processing object group of audio objects comprises two or more audio objects of the plurality of audio objects, and

a processor unit (120) which is designed to generate the processed signal in such a way that the at least one effect parameter specified by means of the interface (110) is applied to the audio object signal or to the audio object metadata of each of the audio objects of the processing object group of audio objects.

2. Device according to claim 1,

wherein one or more audio objects of the plurality of audio objects do not belong to the processing object group of audio objects, and wherein the processor unit (120) is designed not to apply the at least one effect parameter that was specified by means of the interface to any audio object signal or any audio object metadata of the one or more audio objects that do not belong to the processing object group of audio objects.

3. Device according to claim 2,

wherein the processor unit (120) is designed to generate the processed signal in such a way that the at least one effect parameter specified by means of the interface (110) is applied to the audio object signal of each of the audio objects of the processing object group of audio objects,

wherein the processor unit (120) is designed not to apply the at least one effect parameter that was specified by means of the interface to any audio object signal of the one or more audio objects of the plurality of audio objects that do not belong to the processing object group of audio objects.

4. Apparatus according to claim 2 or 3,

wherein the processor unit (120) is designed to generate the processed signal in such a way that the at least one effect parameter specified by means of the interface (110) is applied to the gain parameter of the audio object metadata of each of the audio objects of the processing object group of audio objects,

wherein the processor unit (120) is designed not to apply the at least one effect parameter specified by means of the interface to any gain parameter of the audio object metadata of the one or more audio objects of the plurality of audio objects that do not belong to the processing object group of audio objects.

5. Device according to one of claims 2 to 4,

wherein the processor unit (120) is designed to generate the processed signal in such a way that the at least one effect parameter specified by means of the interface (110) is applied to the position of the audio object metadata of each of the audio objects of the processing object group of audio objects,

wherein the processor unit (120) is designed not to apply the at least one effect parameter that has been specified by means of the interface to any position of the audio object metadata of the one or more audio objects of the plurality of audio objects that do not belong to the processing object group of audio objects.

6. Device according to one of the preceding claims,

wherein the interface (110) is designed for the specification by the user of at least one definition parameter of the processing object group of audio objects,

wherein the processor unit (120) is designed to determine, depending on the at least one definition parameter of the processing object group of audio objects which was specified by means of the interface (110), which audio objects of the plurality of audio objects belong to the processing object group of audio objects.

7. Apparatus according to claim 6,

wherein the at least one definition parameter of the processing object group of audio objects comprises at least one position of an area of interest which is assigned to the processing object group of audio objects, and

wherein the processor unit (120) is designed to determine, for each audio object of the plurality of audio objects, depending on the position of the audio object metadata of this audio object and depending on the position of the area of interest, whether this audio object belongs to the processing object group of audio objects.

8. Apparatus according to claim 7,

wherein the at least one definition parameter of the processing object group of audio objects further comprises a radius of the area of interest which is assigned to the processing object group of audio objects, and

the processor unit (120) being designed to decide, for each audio object of the plurality of audio objects, depending on the position of the audio object metadata of this audio object, depending on the position of the area of interest and depending on the radius of the area of interest, whether this audio object belongs to the processing object group of audio objects.

9. Apparatus according to claim 7 or 8,

wherein the processor unit (120) is designed to determine a weighting factor for each of the audio objects of the processing object group of audio objects as a function of a distance between the position of the audio object metadata of this audio object and the position of the area of interest, and

wherein the processor unit (120) is designed to apply, for each of the audio objects of the processing object group of audio objects, the weighting factor of this audio object together with the at least one effect parameter that was specified by means of the interface (110) to the audio object signal or to the gain parameter of the audio object metadata of this audio object.
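The distance-based selection and weighting of claims 7 to 9 can be sketched as follows. This is a minimal illustration only: the function names and data model are hypothetical, and the claims prescribe neither a Euclidean distance metric nor a linear falloff curve, both of which are assumptions here.

```python
import math

def in_area(obj_pos, area_pos, radius):
    # Claim 8: membership decided from object position, area position and radius
    # (Euclidean distance is an assumption).
    return math.dist(obj_pos, area_pos) <= radius

def weight(obj_pos, area_pos, radius):
    # Claim 9: weighting factor as a function of the distance to the area of
    # interest (linear falloff is an assumption; only distance dependence is claimed).
    return max(0.0, 1.0 - math.dist(obj_pos, area_pos) / radius)

def apply_effect(objects, area_pos, radius, gain_db):
    # Apply the effect parameter (here: a gain offset in dB) scaled by each
    # member's weighting factor; objects outside the area are left untouched.
    for obj in objects:
        if in_area(obj["pos"], area_pos, radius):
            obj["gain_db"] += weight(obj["pos"], area_pos, radius) * gain_db
    return objects
```

An object at the centre of the area would receive the full gain offset, while an object outside the radius would not belong to the processing object group and would remain unmodified.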

10. Apparatus according to claim 6,

wherein the at least one definition parameter of the processing object group of audio objects comprises at least one angle which specifies a direction from a defined user position in which an area of interest is located which is assigned to the processing object group of audio objects, and

wherein the processor unit (120) is designed to determine, for each audio object of the plurality of audio objects, as a function of the position of the metadata of this audio object and as a function of the angle which specifies the direction from the defined user position in which the area of interest is located, whether this audio object belongs to the processing object group of audio objects.

11. Device according to claim 10,

wherein the processor unit (120) is designed to determine a weighting factor for each of the audio objects of the processing object group of audio objects which depends on a difference between a first angle and a further angle, the first angle being the angle which specifies the direction from the defined user position in which the area of interest is located, and the further angle depending on the defined user position and on the position of the metadata of this audio object,

wherein the processor unit (120) is designed to apply, for each of the audio objects of the processing object group of audio objects, the weighting factor of this audio object together with the at least one effect parameter that was specified by means of the interface (110) to the audio object signal or to the gain parameter of the audio object metadata of this audio object.
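The angle-based variant of claims 10 and 11 can be sketched in the same spirit. Again this is only an illustration under stated assumptions: the aperture parameter and the cosine falloff are not part of the claims, which only require that the weighting factor depends on the difference between the area-of-interest angle and the object's angle as seen from the defined user position.

```python
import math

def angle_to(user_pos, obj_pos):
    # Angle of the object as seen from the defined user position (2-D sketch).
    return math.atan2(obj_pos[1] - user_pos[1], obj_pos[0] - user_pos[0])

def angular_weight(user_pos, obj_pos, area_angle, aperture):
    # Claim 11: weight depends on the difference between the first angle
    # (direction of the area of interest) and the object's angle.
    # Wrap the difference into (-pi, pi].
    diff = (angle_to(user_pos, obj_pos) - area_angle + math.pi) % (2 * math.pi) - math.pi
    if abs(diff) > aperture:
        return 0.0  # object does not belong to the processing object group
    # Cosine falloff toward the edge of the aperture (illustrative assumption).
    return math.cos(diff / aperture * math.pi / 2)
```

An object lying exactly in the specified direction gets full weight; one outside the assumed aperture is excluded from the group entirely.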

12. Device according to one of the preceding claims,

wherein the processing object group of audio objects is a first processing object group of audio objects, one or more further processing object groups of audio objects also existing, each processing object group of the one or more further processing object groups of audio objects comprising one or more audio objects of the plurality of audio objects, wherein at least one audio object of a processing object group of the one or more further processing object groups of audio objects is not an audio object of the first processing object group of audio objects,

wherein the interface (110) is designed, for each processing object group of the one or more further processing object groups of audio objects, for the specification by the user of at least one further effect parameter for this processing object group of audio objects,

wherein the processor unit (120) is designed to generate the processed signal in such a way that, for each processing object group of the one or more further processing object groups of audio objects, the at least one further effect parameter of this processing object group, which has been specified by means of the interface (110), is applied to the audio object signal or to the audio object metadata of each of the one or more audio objects of this processing object group, wherein one or more audio objects of the plurality of audio objects do not belong to this processing object group, and wherein the processor unit (120) is designed not to apply the at least one further effect parameter of this processing object group, which was specified by means of the interface, to any audio object signal or any audio object metadata of the one or more audio objects that do not belong to this processing object group.

13. Apparatus according to claim 12,

wherein the interface (110) is designed, in addition to the first processing object group of audio objects, for the specification by the user of the one or more further processing object groups of one or more audio objects, in that the interface (110) is designed, for each processing object group of the one or more further processing object groups of one or more audio objects, for the user to specify at least one definition parameter of this processing object group,

wherein the processor unit (120) is designed to determine, for each processing object group of the one or more further processing object groups of one or more audio objects, depending on the at least one definition parameter of this processing object group which has been specified by means of the interface (110), which audio objects of the plurality of audio objects belong to this processing object group.

14. Device according to one of the preceding claims,

wherein the device is an encoder, wherein the processor unit (120) is configured to generate a downmix signal using the audio object signals of the plurality of audio objects, and wherein the processor unit (120) is configured to generate a metadata signal using the audio object metadata of the plurality of audio objects,

wherein the processor unit (120) is designed to generate the downmix signal as the processed signal, at least one modified object signal for each audio object of the processing object group of audio objects being mixed in the downmix signal, the processor unit (120) being designed, for each audio object of the processing object group of audio objects, to generate the modified object signal of this audio object by means of the application of the at least one effect parameter, which was specified by means of the interface (110), to the audio object signal of this audio object, or

wherein the processor unit (120) is designed to generate the metadata signal as the processed signal, wherein the metadata signal comprises at least one modified position for each audio object of the processing object group of audio objects, the processor unit (120) being designed, for each audio object of the processing object group of audio objects, to generate the modified position of this audio object by means of the application of the at least one effect parameter, which was specified by means of the interface (110), to the position of this audio object, or

wherein the processor unit (120) is designed to generate the metadata signal as the processed signal, the metadata signal comprising at least one modified gain parameter for each audio object of the processing object group of audio objects, wherein the processor unit (120) is designed, for each audio object of the processing object group of audio objects, to generate the modified gain parameter of this audio object by means of the application of the at least one effect parameter, which was specified by means of the interface (110), to the gain parameter of this audio object.
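The encoder-side variant of claim 14 (effect applied to each group member's signal before mixing into the downmix) can be sketched as follows; the function name, data model and the per-sample effect callable are hypothetical illustrations, not the claimed implementation.

```python
def encode_downmix(objects, group_ids, effect):
    # objects: mapping of object id -> list of samples (audio object signals).
    # group_ids: ids belonging to the processing object group.
    # effect: per-sample callable standing in for the specified effect parameter.
    n = max(len(sig) for sig in objects.values())
    downmix = [0.0] * n
    for oid, sig in objects.items():
        # Claim 14: modified object signals of group members are mixed into the
        # downmix; signals of non-members enter the downmix unmodified.
        processed = [effect(s) for s in sig] if oid in group_ids else sig
        for i, s in enumerate(processed):
            downmix[i] += s
    return downmix
```

With two unit-amplitude objects and a doubling effect applied only to the group member, the downmix carries the modified member plus the untouched non-member.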

15. Device according to one of claims 1 to 13,

wherein the device is a decoder, wherein the device is designed to receive a downmix signal in which the plurality of audio object signals of the plurality of audio objects are mixed, wherein the device is further designed to receive a metadata signal, the metadata signal comprising, for each audio object of the plurality of audio objects, the audio object metadata of that audio object,

wherein the processor unit (120) is designed to reconstruct the plurality of audio object signals of the plurality of audio objects based on the downmix signal,

wherein the processor unit (120) is designed to generate an audio output signal comprising one or more audio output channels as the processed signal,

wherein the processor unit (120) is designed, in order to generate the processed signal, to apply the at least one effect parameter which has been specified by means of the interface (110) to the audio object signal of each of the audio objects of the processing object group of audio objects, or to apply the at least one effect parameter specified by means of the interface (110) to the position or to the gain parameter of the audio object metadata of each of the audio objects of the processing object group of audio objects.

16. The device according to claim 15,

wherein the interface (110) is further designed for the specification of one or more rendering parameters by the user,

and wherein the processor unit (120) is configured to generate the processed signal using the one or more rendering parameters as a function of the position of each audio object of the processing object group of audio objects.

17. A system, comprising:

an encoder (200) for generating a downmix signal based on audio object signals of a plurality of audio objects and for generating a metadata signal based on audio object metadata of the plurality of audio objects, the audio object metadata comprising a position of the audio object and a gain parameter of the audio object, and

a decoder (300) for generating an audio output signal comprising one or more audio output channels based on the downmix signal and based on the metadata signal,

wherein the encoder (200) is a device according to claim 14, or

wherein the decoder (300) is a device according to claim 15 or 16, or

wherein the encoder (200) is a device according to claim 14 and the decoder (300) is a device according to claim 15 or 16.

18. A method of generating a processed signal using a plurality of audio objects, wherein each audio object of the plurality of audio objects comprises an audio object signal and audio object metadata, the audio object metadata including a position of the audio object and a gain parameter of the audio object, the method comprising:

Specification of at least one effect parameter of a processing object group of audio objects by a user by means of an interface (110), the processing object group of audio objects comprising two or more audio objects of the plurality of audio objects, and

Generating the processed signal by a processor unit (120) such that the at least one effect parameter that has been specified by means of the interface is applied to the audio object signal or to the audio object metadata of each of the audio objects of the processing object group of audio objects.

19. A computer program with a program code for performing the method according to claim 18.

Documents

Application Documents

# Name Date
1 202117021604-TRANSLATIOIN OF PRIOIRTY DOCUMENTS ETC. [13-05-2021(online)].pdf 2021-05-13
2 202117021604-STATEMENT OF UNDERTAKING (FORM 3) [13-05-2021(online)].pdf 2021-05-13
3 202117021604-REQUEST FOR EXAMINATION (FORM-18) [13-05-2021(online)].pdf 2021-05-13
4 202117021604-NOTIFICATION OF INT. APPLN. NO. & FILING DATE (PCT-RO-105) [13-05-2021(online)].pdf 2021-05-13
5 202117021604-FORM 18 [13-05-2021(online)].pdf 2021-05-13
6 202117021604-FORM 1 [13-05-2021(online)].pdf 2021-05-13
7 202117021604-DRAWINGS [13-05-2021(online)].pdf 2021-05-13
8 202117021604-DECLARATION OF INVENTORSHIP (FORM 5) [13-05-2021(online)].pdf 2021-05-13
9 202117021604-COMPLETE SPECIFICATION [13-05-2021(online)].pdf 2021-05-13
10 202117021604-Proof of Right [06-07-2021(online)].pdf 2021-07-06
11 202117021604-FORM-26 [06-07-2021(online)].pdf 2021-07-06
12 202117021604.pdf 2021-10-19
13 202117021604-FORM 3 [19-10-2021(online)].pdf 2021-10-19
14 202117021604-FER.pdf 2022-03-22
15 202117021604-Information under section 8(2) [29-06-2022(online)].pdf 2022-06-29
16 202117021604-FORM 4(ii) [13-09-2022(online)].pdf 2022-09-13
17 202117021604-FORM 3 [21-10-2022(online)].pdf 2022-10-21
18 202117021604-OTHERS [22-12-2022(online)].pdf 2022-12-22
19 202117021604-Information under section 8(2) [22-12-2022(online)].pdf 2022-12-22
20 202117021604-FER_SER_REPLY [22-12-2022(online)].pdf 2022-12-22
21 202117021604-CLAIMS [22-12-2022(online)].pdf 2022-12-22
22 202117021604-Information under section 8(2) [17-01-2023(online)].pdf 2023-01-17
23 202117021604-FORM 3 [13-04-2023(online)].pdf 2023-04-13
24 202117021604-Information under section 8(2) [19-06-2023(online)].pdf 2023-06-19
25 202117021604-FORM 3 [11-09-2023(online)].pdf 2023-09-11
26 202117021604-Information under section 8(2) [18-09-2023(online)].pdf 2023-09-18
27 202117021604-Information under section 8(2) [08-11-2023(online)].pdf 2023-11-08
28 202117021604-PatentCertificate13-12-2023.pdf 2023-12-13
29 202117021604-IntimationOfGrant13-12-2023.pdf 2023-12-13

Search Strategy

1 SEARCHSTRATEGYE_21-03-2022.pdf

ERegister / Renewals

3rd: 21 Dec 2023

From 18/02/2021 - To 18/02/2022

4th: 21 Dec 2023

From 18/02/2022 - To 18/02/2023

5th: 21 Dec 2023

From 18/02/2023 - To 18/02/2024

6th: 21 Dec 2023

From 18/02/2024 - To 18/02/2025

7th: 31 Jan 2025

From 18/02/2025 - To 18/02/2026