Methods And Systems For Presenting Three Dimensional Motion Pictures With Content Adaptive Information
Abstract:
The present invention relates generally to methods and systems for the production of 3D motion picture subtitles
adapted to image content for improved viewer experience. Some embodiments of the present invention relate to positioning subtitles
at variable, scene-dependent depth. Certain aspects of the present invention may be applicable to general 3D display applications
and/or digital projection of 3D motion pictures.
METHODS AND SYSTEMS FOR PRESENTING THREE-DIMENSIONAL MOTION
PICTURES WITH CONTENT ADAPTIVE INFORMATION
Cross-Reference to Related Applications
[0001] This application claims priority to U.S. Provisional Application Serial No.
61/200,725, titled "Methods and Systems for Presenting Three-Dimensional Motion
Pictures with Content Adaptive Three-Dimensional Subtitles," and filed December 1,
2008, the entire contents of which are incorporated herein by reference.
Field of the Disclosure
[0002] This disclosure relates generally to three-dimensional image processing
and, more particularly, to processing images to display additional information, such
as subtitles, with a three-dimensional (3D) image based on content of the 3D image.
Background
[0003] Subtitles are textual representations of the aural dialog in a motion
picture presentation, typically translated into a language different from that of the
original version. Subtitles may also be captions that describe both the aural dialog
and other sounds to aid hearing-impaired viewers. Caption text may be displayed on
the screen or displayed separately. The
term "subtitle" refers to any text or graphic displayed on the picture presentation
screen. A subtitle is a type of "additional information" that may be displayed in
addition to the picture. Subtitles are displayed on a screen, usually at the bottom of
the screen, to help the audience follow the dialog in the movie, such as dialog
spoken in a language the audience may not understand, or to assist audience
members who have difficulty hearing sounds.
[0004] Typically, subtitles are received as a subtitle file that contains subtitle
elements for a motion picture. A subtitle element can include subtitle text and timing
information indicating when the subtitle text should appear and disappear on the
screen. Often, the timing information is based on a time code or other equivalent
information such as film length (e.g. measured in feet and frames). A subtitle file can
also include other attributes such as text fonts, text color, subtitle screen positioning
and screen alignment information, which describe how subtitles should appear on
the screen. A conventional subtitle display system interprets the information from a
subtitle file, converts subtitle elements to a graphical representation and displays the
subtitles on a screen in synchronization with images and in accordance with the
information in the subtitle file. The function of a conventional subtitle display system
can be performed by a digital cinema server that superimposes the converted
subtitle representation onto images to be displayed by a digital projector.
[0005] The presentation of a three-dimensional (3D) motion picture is performed
by displaying stereoscopic 3D images in sequence using a stereoscopic 3D display
system. A 3D image includes a left-eye image and a corresponding right-eye image,
representing two slightly different views of the same scene similar to the two
perspectives as perceived by both eyes of a human viewer. The differences
between the left-eye and the right-eye images are referred to as binocular disparity,
which is often used interchangeably with "disparity". Disparity can refer to the
horizontal position difference between a pixel in a left-eye image and the
corresponding pixel in a corresponding right-eye image. Disparity may be measured
by the number of pixels. A similar concept is "parallax" which refers to the horizontal
position distance between such a pair of pixels when displayed on the screen.
Parallax may be measured by a distance measure, such as in inches. The value of
parallax can be related to the value of pixel disparity in the 3D image data by
considering the dimension of the display screen. A 3D motion picture includes
multiple left-eye image sequences and corresponding right-eye image sequences. A
3D display system can ensure that a left-eye image sequence is presented to the left
eye of a viewer and a right-eye image sequence is presented to the right eye of the
viewer, producing the perception of depth. The perceived depth of a pixel in a 3D
image frame can be determined by the amount of parallax between the displayed
left-eye and right-eye views of the corresponding pixel pair. A 3D image with a strong
parallax, or with larger pixel disparity values, appears closer to the human viewer.
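The relation between pixel disparity and on-screen parallax described above can be illustrated with a short sketch (the function name and example values below are illustrative additions, not from the disclosure): each pixel spans the screen width divided by the image width, so the parallax is the disparity scaled by that pixel pitch.

    def disparity_to_parallax_inches(disparity_px, image_width_px, screen_width_ft):
        # Each pixel spans (screen width / image width); the parallax is the
        # disparity scaled by that pixel pitch, converted here to inches.
        pixel_pitch_in = (screen_width_ft * 12.0) / image_width_px
        return disparity_px * pixel_pitch_in

    # Example: 20 pixels of disparity in a 2048-pixel-wide image on a
    # 70-foot screen yields about 8.2 inches of parallax.
    print(disparity_to_parallax_inches(20, 2048, 70))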
[0006] One method of providing subtitles, or any additional information, for a 3D
motion picture includes using a conventional subtitle display system in which a
monoscopic version of subtitle images is displayed on a screen for both the left and
right eyes to see, effectively placing the subtitles at the depth of the screen. When
3D images with a strong parallax are presented with a monoscopic version of
subtitles, an audience may have difficulty reading the subtitles that appear behind
the depth of the images because the eyes of audience members are unable to fuse
the images at one depth and the subtitles at a different depth simultaneously.
[0007] A subtitle displayed conventionally with a 3D image is depicted in Figure 1.
The displayed 3D image includes a main object 106 that has an apparent depth in
front of the screen 102. The monoscopic subtitle text 108 has an apparent depth at
the screen. When a viewer wearing 3D glasses 104 focuses on the main object 106,
the subtitle 108 behind the main object 106 may be perceived as double images 110
and 112. Viewers may experience
difficulty in reading the subtitle text while watching the 3D images. This problem is
particularly unpleasant for an audience in a large-screen 3D cinema venue, such as
an IMAX® 3D theater, where 3D images are presented with stronger parallax and
appear more immersive and closer to the audience than in a smaller 3D theater.
[0008] Although this problem is presented for subtitles, any information in addition
to the 3D image to be displayed with the 3D image can experience this and other
problems discussed herein.
[0009] Another method of projecting subtitles for a 3D motion picture with a
conventional subtitle display system is to place the monoscopic version of subtitles
near the top of a screen. Such a method reduces audience-viewing discomfort
since, in most 3D scenes, image content near the top of image frames often has
more distant depth values than image content near the bottom of the image frames.
For example, image content near the top of an image often includes sky, clouds, the
roof of a building or hills that appear far away from the other objects in a scene.
These types of content often have a depth close to or behind the screen depth. A
viewer may find it easier to read the monoscopic version of subtitles while nearby
image content is far away or even behind the screen depth. However, viewers may
continue to experience difficulty when image content near the top of a screen has an
apparent depth that is close to the viewer. Furthermore, viewers may find it
inconvenient to focus continually on the top of an image to receive subtitles or other
information in addition to the image.
[0010] Accordingly, systems and methods are desirable that can cause subtitles
or other additional information to be displayed at an acceptable depth or other
location on the display with a 3D image.
[0011] Furthermore, although some existing methods can be used to determine
the depth of 3D image content, such methods are not suited to determining the
depth of 3D image content quickly and dynamically. A conventional
stereo-matching method is unable to deliver accurate disparity results consistently
because it fails to account for temporally changing image content. As a result, the
depth of 3D subtitles computed based on a conventional stereo matching method
may not be temporally consistent and, thus, may result in viewing discomfort by the
audience. Furthermore, a conventional stereo matching method may not be efficient
and sufficiently reliable for automated and real-time computing applications.
Accordingly, systems and methods are also desirable that can be used to determine
a depth of 3D image content quickly and dynamically so that the depth can be used
to locate subtitle or other information in addition to the 3D image content.
Summary
[0012] Certain embodiments relate to processing and displaying subtitles
stereoscopically in a three-dimensional (3D) motion picture presentation to enable
an audience to view the images and read the subtitles with ease and comfort. The
stereoscopic 3D subtitles, or 3D subtitles, can be created by displaying a left-eye
subtitle image and a right-eye subtitle image with a proper disparity or parallax.
[0013] In one embodiment, 3D subtitles are processed that have a content
adaptive depth based on 3D images with high levels of computing efficiency and
computing reliability.
[0014] In one embodiment, 3D subtitles are processed that have a content
adaptive depth with high levels of computing efficiency and computing reliability,
based on a compressed version of the 3D images available in the form of a digital
cinema package (DCP).
[0015] In one embodiment, 3D subtitles that have a content adaptive depth are
processed and displayed, while maintaining a consistent perceived subtitle font size.
[0016] In one embodiment, a 3D digital projection system is provided for
computing and displaying 3D subtitles with content adaptive depth.
[0017] In one embodiment, 3D subtitles with a content adaptive depth, as well as
other content adaptive subtitle attributes including font style, font size, color or
luminance and screen position, are processed and displayed.
[0018] In one embodiment, a 3D digital projection system is provided for
computing and displaying 3D subtitles with content adaptive depth as well as other
content adaptive subtitle attributes including font style, font size, color or luminance
and screen position.
[0019] In an embodiment, a 3D image sequence and a subtitle file for the 3D
image sequence are received. The subtitle file includes a subtitle element and
timing information associated with the subtitle element. The subtitle element is
associated with a segment of the 3D image sequence based on timing information.
An abstract depth map is computed from the segment associated with the subtitle
element. A proxy depth is computed based on the abstract depth map for the
subtitle element. The proxy depth is used to determine a render attribute for the
subtitle element. The render attribute is outputted.
[0020] In an embodiment, a display medium is provided for displaying images on
the display medium. The display medium includes a 3D image sequence that has
content at variable apparent depths. The display medium also includes a subtitle
element that has an apparent depth that changes based on the variable apparent
depths of the content of the 3D image sequence.
[0021] These illustrative embodiments are mentioned not to limit or define the
disclosure, but to provide examples to aid understanding thereof. Additional
embodiments are discussed in the Detailed Description, and further description is
provided there. Advantages offered by one or more of the various embodiments
may be further understood by examining this specification or by practicing one or
more embodiments presented.
Brief Descriptions of the Drawings
[0022] Figure 1 illustrates a prior art representation of a three-dimensional (3D)
image with monoscopic subtitles displayed on a screen.
[0023] Figure 2 illustrates a representation of a 3D image with stereoscopic
subtitles displayed on a screen according to one embodiment of the present
invention.
[0024] Figure 3 depicts a system that is capable of determining render attributes
for a stereoscopic subtitle to be displayed on a screen with a 3D image according to
one embodiment of the present invention.
[0025] Figure 4 depicts a flow diagram of a method for computing stereoscopic
subtitles to be displayed with a 3D image according to one embodiment of the
present invention.
[0026] Figure 5 graphically illustrates image abstraction according to one
embodiment of the present invention.
[0027] Figure 6 graphically illustrates vertical sampling projection according to
one embodiment of the present invention.
[0028] Figure 7 graphically illustrates multiple vertical sampling projection
according to one embodiment of the present invention.
[0029] Figure 8 graphically illustrates multi-region image abstraction according to
one embodiment of the present invention.
[0030] Figure 9 graphically illustrates a second embodiment of multi-region image
abstraction.
[0031] Figure 10 graphically illustrates an abstract image pair and an abstract
depth map according to one embodiment of the present invention.
[0032] Figure 11 depicts a functional block diagram of a proxy depth decision
module according to one embodiment of the present invention.
[0033] Figure 12 illustrates disparity distribution of a 3D image segment according
to one embodiment of the present invention.
[0034] Figure 13 illustrates a distogram of a 3D image segment according to one
embodiment of the present invention.
[0035] Figure 14A is an example of conventional subtitle text file according to one
embodiment of the present invention.
[0036] Figure 14B is an example of a 3D subtitle text file with proxy depth
according to one embodiment of the present invention.
[0037] Figure 15 graphically illustrates temporal window selection according to one
embodiment of the present invention.
[0038] Figure 16 graphically illustrates determining a proxy depth from a
distogram according to one embodiment of the present invention.
[0039] Figures 17A and 17B graphically depict selective DCP decoding according
to one embodiment of the present invention.
[0040] Figure 18 graphically depicts JPEG2K Level 3 sub-bands and
corresponding packets according to one embodiment of the present invention.
[0041] Figure 19 is a functional block diagram for an offline content adaptive 3D
subtitle computing system according to one embodiment of the present invention.
[0042] Figure 20 is a functional block diagram for a real-time content adaptive 3D
subtitle computing system according to one embodiment of the present invention.
[0043] Figure 21 is a flow chart of a subtitling controller method according to one
embodiment of the present invention.
Detailed Description
[0044] Certain aspects and embodiments of the inventive concepts disclosed
herein relate to methods and systems for displaying three-dimensional (3D) images
with additional information, such as subtitles, at a location and a depth based on the
content of the 3D images. While the methods disclosed are generally suitable for
any type of 3D stereoscopic display systems, they may have particular applicability
to 3D motion picture theaters with an immersive viewing environment.
[0045] In some embodiments, additional information, such as subtitles, is displayed
at a depth that is the same as, or is otherwise based on, the depth of content in the
3D image displayed. Figure 2 depicts one embodiment of a subtitle element 214
displayed at a depth that is based on the depth of a main image object 106 in the 3D
image. By displaying the subtitle element 214 at a depth that is based on content of
a 3D image, both the 3D image and the subtitle can be viewed simultaneously and
comfortably by a viewer 104. Furthermore, if the depth of the main image object 106
changes, the depth of the subtitle element 214 can also change based on the
change of depth of the main image object 106.
[0046] The depth placement of the subtitle element 214 can be provided in a
stereoscopic method by displaying a left-eye view and a right-eye view of the same
subtitle element with a proper parallax. A subtitle displayed in such a way can be
referred to as a stereoscopic subtitle or a 3D subtitle. The
amount of parallax that may be needed for the depth placement of the subtitle can
be determined by computing the depth of the main image object 106, or equivalently
by computing the pixel disparity values of the main image object 106.
[0047] The left-eye view and the right-eye view of a 3D subtitle may be created by
horizontally shifting a subtitle element in screen positions. For example, the subtitle
text of the left-eye view may be created by horizontally shifting the subtitle element to
the right by ten pixels while the corresponding right-eye view of the subtitle text may
be created by shifting the subtitle element to the left by ten pixels. The resulting 3D
subtitle thus has a disparity of twenty pixels between the left-eye and right-eye
views. The actual perceived depth of the subtitle element with such a disparity is
dependent both on the display screen size and on the image resolution. For a 2K
resolution image with an image width of 2048 pixels that is displayed on a screen
seventy feet wide, the subtitle element with a disparity of twenty pixels can
appear to be approximately fourteen feet away from the audience.
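The "approximately fourteen feet" figure follows from viewing geometry that the passage leaves implicit. A minimal sketch of the usual similar-triangles depth model is given below; the 60-foot viewing distance and 2.5-inch eye separation are assumptions for illustration, not values from the disclosure.

    def perceived_distance_ft(disparity_px, image_width_px, screen_width_ft,
                              viewing_distance_ft=60.0, eye_separation_in=2.5):
        # Crossed (in-front-of-screen) disparity: the two lines of sight
        # converge in front of the screen at distance D*e/(e + p) from the
        # viewer, where D is the viewing distance, e the eye separation and
        # p the on-screen parallax.
        parallax_in = disparity_px * (screen_width_ft * 12.0) / image_width_px
        return (viewing_distance_ft * eye_separation_in
                / (eye_separation_in + parallax_in))

    # About 14 ft for 20 px of disparity on a 70 ft screen viewed from 60 ft.
    print(round(perceived_distance_ft(20, 2048, 70), 1))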
[0048] The subtitle can be located in front of the closest object in a 3D image at
the position of the subtitle element by a fixed amount, which may be a fixed number
of pixels of additional disparity. For example, if the closest image object is ten feet from the
audience, the subtitle element can be placed with four pixels of additional disparity to
each eye with a total additional disparity of eight pixels, which effectively places the
subtitle approximately two feet closer to the audience than the image object. Since
images of a 3D motion picture exhibit a constantly changing depth, the depth of the
subtitle may change following the depth of image content and may remain in front of
the closest object at the position of the subtitle element in the image. In some
embodiments, the additional disparity can be in a range of 1 pixel to 20 pixels for
images with a width of 2048 pixels, or in a range of 1 pixel to 40 pixels for images
with a width of 4096 pixels. The depth of image objects may be computed using a
stereo matching method or other suitable methods.
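As a sketch of the placement rule just described (the function name and the per-eye offset are illustrative assumptions):

    def subtitle_disparity(closest_object_disparity_px, extra_px_per_eye=4,
                           image_width_px=2048):
        # Offset the subtitle in front of the closest object at its screen
        # position; the total extra disparity is kept within the 1-20 pixel
        # range suggested for 2048-pixel-wide images (1-40 for 4096).
        max_extra = 20 if image_width_px <= 2048 else 40
        extra_total = min(max(2 * extra_px_per_eye, 1), max_extra)
        return closest_object_disparity_px + extra_total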
[0049] In some embodiments, stereo matching methods can be used to compute
the pixel disparity of 3D images. Typically, a subtitle element appears on the screen
when a person begins to speak, or shortly thereafter, and disappears when the
person stops speaking. An average duration of display for a subtitle element is a few
seconds, but it can be much longer or shorter under certain circumstances. During
display of a subtitle element, many frames of images are projected on the screen,
and these images may contain temporally changing content, such as object motion,
lighting change, scene dissolve and scene cuts.
[0050] According to some embodiments of the present invention, a proxy depth
value for a subtitle element is computed by analyzing all 3D image frames within a
temporal window that corresponds to the duration of the subtitle element. The proxy
depth value for a subtitle element may be constant or may vary from frame to frame
over the duration of the subtitle. The proxy depth value can be associated with the
subtitle element and can be a representative value for that subtitle element. The
actual depth placement of a subtitle element may be determined based on the
computed proxy depth value. Each subtitle element in a 3D motion picture can be
placed at a depth determined by the proxy depth, which is adaptive to the image
content.
[0051] Content adaptive methods according to some embodiments can be
extended to other attributes of subtitles, including but not limited to subtitle font style,
font size, color, luminance and screen positions. Any type of attribute can be made
content adaptive to enhance the viewing experience of a 3D motion picture. An
appropriate method or a set of appropriate image analysis methods can be used to
determine the placement of each of the said attributes of subtitles.
[0052] The depth placement of a subtitle element can be produced by an
apparatus through the control of the horizontal positions of the left-eye view and the
right-eye view of the subtitle element displayed on a 3D screen. The depth
placement produced by the apparatus may or may not be identical to the proxy depth
computed. One example of such a difference is that the apparatus may have a
limited depth range and depth resolution. The same apparatus may also control the
other said content adaptive attributes of subtitles.
[0053] The attributes of conventional subtitles can be provided by a text-based
subtitle file. One type of information provided by a subtitle file may be the start time
and the end time of each subtitle element. Such timing information can be used to
determine a temporal window for computing the depth and other content adaptive
attributes of a subtitle element.
[0054] Figure 3 illustrates one embodiment of a system that can be used to
generate 3D subtitles or other information to be displayed with 3D images. The
system includes a computing device 302 having a processor 304 that can execute
code stored on a computer-readable medium, such as a memory 306, to cause the
computing device 302 to compute subtitle attributes or other information to be
displayed with 3D images. The computing device 302 may be any device that can
process data and execute code that is a set of instructions to perform actions.
Examples of the computing device 302 include a desktop personal computer, a
laptop personal computer, a server device, a handheld computing device, and a
mobile device.
[0055] Examples of the processor 304 include a microprocessor, an application-
specific integrated circuit (ASIC), a state machine, or other suitable processor. The
processor 304 may include one processor or any number of processors. The
processor 304 can access code stored in the memory 306 via a bus 308. The
memory 306 may be any tangible computer-readable medium capable of storing
code. The memory 306 can include electronic, magnetic, or optical devices, capable
of providing processor 304 with executable code. Examples of the memory 306
include random access memory (RAM), read-only memory (ROM), a floppy disk,
compact disc, digital video disc, magnetic disk, an ASIC, a configured processor,
or other storage device capable of tangibly embodying code. The bus 308 may be
any device capable of transferring data between components of the computing
device 302. The bus 308 can include one device or multiple devices.
[0056] The computing device 302 can share data with additional components
through an input/output (I/O) interface 310. The I/O interface 310 can include a USB
port, an Ethernet port, a serial bus interface, a parallel bus interface, a wireless
connection interface, or any suitable interface capable of allowing data transfers
between the computing device and peripheral devices/networks 312. The peripheral
devices/networks 312 can include a keyboard, a display, a mouse device, a touch
screen interface, or other user interface device/output device capable of receiving
commands from a user and providing the commands to the computing device 302.
Other peripheral devices/networks 312 include the internet, an intranet, wide area
network (WAN), local area network (LAN), virtual private network (VPN), or any
suitable communications network that allows computing device 302 to communicate
with other components.
[0057] Instructions can be stored in the memory 306 as executable code. The
instructions can include processor-specific instructions generated by a compiler
and/or an interpreter from code written in any suitable computer-programming
language, such as C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, and
ActionScript. The instructions can be generated by software modules that are stored
in the memory 306 and, when executed by the processor 304, can cause the
computing device 302 to perform actions.
[0058] The software modules can include an image decoding module 314, a
temporal window selection module 316, an image abstraction module 318, an
abstract depth computing module 320, a proxy depth decision module 322, and a
render attribute computing module 324. The image decoding module 314 may be
used to decode left-eye image data and right-eye image data that are encoded or
encrypted to an uncompressed and non-encrypted format. The temporal window
selection module 316 can select a segment of the 3D image data for each subtitle
element based on the subtitle timing information in a subtitle file. The image
abstraction module 318 can simplify each 3D image segment into a pair of left and
right abstract images (e.g. one image from the left-eye image sequence and one
image from the right-eye image sequence). The abstract depth computing module
320 can compute an abstract depth map from the left and right abstract images. The
proxy depth decision module 322 can compute a proxy depth for a subtitle element
based on the abstract depth map. The render attribute computing module can
determine a render attribute for a subtitle element, based on the proxy depth for the
subtitle element and other image information, for example.
[0059] This exemplary system configuration is provided merely to illustrate a
potential configuration that can be used to implement certain embodiments. Other
configurations may of course be utilized.
[0060] Figure 4 illustrates one embodiment of a method for computing the
attributes for 3D subtitle elements based on the content of the 3D images. Although
the method shown in Figure 4 is described as applying to subtitles, the method can
apply to any type of information in addition to the 3D images. Furthermore, Figure 4
is described with reference to the system of Figure 3, but other implementations are
possible.
[0061] In block 402, a 3D image sequence is received by the computing device
302. The 3D image sequence can include a left-eye image sequence and a right-
eye image sequence that is associated with the left-eye image sequence. In some
embodiments, the 3D image sequence is received as an encoded file, such as a
Digital Cinema Package (DCP) file or an MPEG2 video file. The image decoding
module 314 can decode the encoded file to an uncompressed and non-encrypted file
format.
[0062] In block 404, the computing device 302 receives a subtitle file that includes
at least one subtitle element associated with timing information. The timing
information can correspond to timing information of the 3D motion picture. The
subtitle element can include text or other attributes or any other additional
information for display with the 3D image sequence.
[0063] In block 406, the computing device 302 can associate the subtitle element
with a segment of the 3D image sequence based on the timing information. The
temporal window selection module 316 can select a segment of images from the 3D
sequence based on the timing information of the subtitle element. In some
embodiments, the temporal window selection module 316 can save computation
time by skipping sections of image sequences that are not associated with subtitles,
while processing the remaining sections. The image sequences may also be
partitioned into segments based on a limitation on the length of the image sequence.
Each segment can be associated with a subtitle element using timing information.
For example, each image segment is associated with a time window and can be
associated with subtitle elements having timing information that is within the time
window.
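As a simple sketch of this association (assuming time codes expressed in seconds and a fixed frame rate; the names are illustrative):

    def subtitle_frame_window(start_s, end_s, fps=24.0):
        # Map a subtitle element's start/end times to the inclusive range of
        # 3D image frames forming its temporal window.
        first_frame = int(start_s * fps)
        last_frame = int(end_s * fps)
        return first_frame, last_frame

    # A subtitle displayed from 12.5 s to 15.0 s spans frames 300-360 at 24 fps.
    print(subtitle_frame_window(12.5, 15.0))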
[0064] In block 408, the computing device 302 computes an abstract depth map
from the image segment associated with the subtitle element. An abstract depth
map may be a representation of depth values, or pixel disparity values, for image
frames or certain image frames of the segment. In some embodiments, the image
abstraction module 318 can simplify the segment into a pair of left and right abstract
images, one from the left-eye image sequence of the segment and one from the
right-eye image sequence of the segment. An abstract image may be a simplified
version of an image segment in which each image frame of the segment is reduced
to a single line of the abstract image by projecting each column of pixels of an image
frame into a single pixel. A left abstract image that is projected in such a way from a
left-eye image segment and a right abstract image that is projected from the
corresponding right-eye image segment form an abstract image pair. The abstract
depth computing module 320 can compute the depth values, or the pixel disparity
values, of an abstract image pair and store the resulting depth information in an
abstract depth map. The abstract depth map can include depth values, or the pixel
disparity values, of all pixels or certain pixels of the abstract image pair..
[0065] In block 410, the computing device 302 computes a proxy depth based on
the abstract depth map for the subtitle element. A proxy depth may be a
representative depth for a subtitle element, and it may be a constant or a variable
value over the duration of the subtitle element. The proxy depth can represent
changes in depth over time in the 3D image sequences. In some embodiments, the
proxy depth decision module 322 computes a proxy depth for the subtitle element
that is a constant value or a value that changes over the duration of the subtitle
element.
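One simple way to obtain such a proxy, sketched below under the assumption that the abstract depth map is stored as an array with one row of disparity values per frame, is to take a high percentile of each row and smooth the result over time; the proxy depth decision module described later uses a richer analysis, so this is illustrative only.

    import numpy as np

    def proxy_depth(abstract_depth_map, percentile=95, smooth_frames=12):
        # Per-frame proxy disparity: a high percentile of each frame's row,
        # smoothed with a moving average so the subtitle depth does not
        # jitter from frame to frame.
        per_frame = np.percentile(abstract_depth_map, percentile, axis=1)
        kernel = np.ones(smooth_frames) / smooth_frames
        return np.convolve(per_frame, kernel, mode="same")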
[0066] In block 412, the computing device 302 uses the proxy depth to determine
a render attribute for the subtitle element. Examples of render attributes include
depth placement, font size, font color, position on screen and font style of 3D
subtitles as well as the color, size, position, and style of additional information, such
as images. In some embodiments, the render attribute computing module 324 uses
the proxy depth, which is based at least in part on the depth of content of an
associated 3D image sequence, to determine a render attribute that includes at least
one instruction for rendering the subtitle element. For example, the proxy depth may
be determined to be the render attribute of depth for the subtitle element, or used to
determine the render attribute of depth for the subtitle element.
[0067] In block 414, the computing device 302 outputs the render attribute for the
subtitle element. The render attribute can be used to render the subtitle element to
be displayed with the 3D image sequence.
[0068] The following describes additional embodiments of the modules and
features discussed above.
Image Abstraction
[0069] Embodiments of the image abstraction module 318 can perform various
functions such as simplifying a 3D image sequence into a pair of abstract images,
one for the left eye and one for the right eye, through image projection. The
projection can be performed vertically so that each column of pixels in an image
frame is projected into a single pixel, and each frame is projected into a single line.
The projected lines from each of the image frames of the 3D image sequence can
form a pair of abstract images.
[0070] A graphical illustration of an embodiment of an image abstraction process
is depicted in Figure 5. A left-eye image sequence 502 is shown that includes N
frames, and each frame includes H lines. Each line includes W pixels. The left-eye
image sequence 502 can be projected into a left abstract image 506 with N lines,
with each line including W pixels. The first line of the left abstract image 506 can be
projected from the first frame of the left-eye image sequence, and the second line of
the left abstract image 506 can be projected from the second frame of the left-eye
image sequence, etc. The projected lines can form a WxN left abstract image 506.
Similarly, the right-eye image sequence 504 can be projected into a right abstract
image 508 with N lines and W pixels in each line. Both the left abstract image 506
and the right abstract image 508 form an abstract image pair.
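The average projection of Figure 5 can be sketched in a few lines (assuming grayscale sequences stored as numeric arrays; the array shapes are illustrative):

    import numpy as np

    def abstract_image(image_sequence):
        # Project an (N, H, W) image sequence into an (N, W) abstract image:
        # frame k's columns are averaged vertically to form line k.
        return image_sequence.mean(axis=1)

    left_seq = np.random.rand(48, 1080, 2048)   # N=48 frames of H=1080, W=2048
    left_abstract = abstract_image(left_seq)    # shape (48, 2048)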
[0071] In some embodiments, the projection is performed based on a vertical
sampling projection algorithm, an embodiment of which is depicted in Figure 6. The
position of a subtitle element can be pre-defined or specified in a subtitle file.
Subtitle elements can be centered near the bottom of an image frame, but other
positions are also possible. Figure 6 shows the subtitle element contained in a
subtitle region 604 of the kth left image frame 602 of an image sequence. A
sampling line 606 can be selected near or at the center of the subtitle region 604.
The pixels of each column of the kth left image frame 602 can be projected into a
single pixel towards the sampling line 606 to form the left abstract image 610. For
example, all, or substantially all, pixels of image column m 608 can be projected
towards point A on the sampling line, and projection can be performed so that the
pixels above the sampling line are projected downwards and pixels below the
sampling line are projected upwards. The result of projection can produce pixel B in
the left abstract image 610, at the location of (m, k).
[0072] The value of projected pixel B can be determined by a projection function
selected. The projection function can be selected to compress the original 3D image
sequences into a pair of abstract images, while preserving both depth information
and depth change information. In one embodiment, the projection function is based
on a mathematical average. In another embodiment, the projection function is a
weighted average with higher weights assigned to pixels closer to the sampling line.
The projection process can be repeated for each column of image frame k, and the
result is the kth line 612 in the left abstract image 610. A similar projection method
can be applied to the right-eye image frame to produce a right abstract image (not
shown in Figure 6).
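A weighted-average projection of this kind might be sketched as follows; the exponential falloff of the weights with distance from the sampling line is an assumption for illustration, since the disclosure only requires higher weights nearer the line.

    import numpy as np

    def sampled_projection(frame, sampling_row, falloff=100.0):
        # Project each column of an (H, W) frame into one pixel, weighting
        # pixels by their distance to the sampling line so that rows near
        # the subtitle region dominate the projected value.
        rows = np.arange(frame.shape[0])
        weights = np.exp(-np.abs(rows - sampling_row) / falloff)
        return (weights[:, None] * frame).sum(axis=0) / weights.sum()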
[0073] Another embodiment of the vertical sampling projection algorithm uses
multiple sampling lines, which can be a multiple vertical sampling projection
algorithm. An example of such an algorithm is depicted in Figure 7, in which a kth
left image frame 702 is divided into a primary region 716 containing the subtitle
region 704 and two auxiliary regions: a top region 720 and a center region 718.
[0074] A sampling line can be selected for each region. The sampling line
selected for the primary region 716 may be a primary sampling line 706 that can be
selected near or at the center of the subtitle region 704. The primary sampling line
can be assigned a primary role in a projection algorithm through appropriate weights
in the projection function. In one embodiment, pixels closer to the primary sampling
line are assigned higher weights than those closer to auxiliary sampling lines. The
sampling line selected for an auxiliary region may be an auxiliary sampling line that
can be located at, but not restricted to, the center of the region. In the example
shown in Figure 7, the auxiliary sampling line 710 represents the depth change at
the top auxiliary region 720 of the image frame, and the auxiliary sampling line 708
represents the depth change at the center auxiliary region 718 of the image frame.
Vertical sampling projection can be performed within each region so that pixels are
vertically projected towards the sampling line of the region.
[0075] In the example shown in Figure 7, the pixels of the mth column 722 within
the primary region 716 are projected towards point A on the primary sampling line
706; the pixels of the same column within the region 718 are projected towards point
B on the auxiliary sampling line 708, and the remaining pixels of column m within the
top region 720 are projected towards point C on the auxiliary sampling line 710. In
some embodiments, the number of divided regions and the location of sampling lines
are determined based on a number of factors, including the position of the subtitle
region, the aspect ratio of the 3D images, and theatre geometry. For example, more
sampling positions may be used for the IMAX® 15perf/70mm image format with a
projection aspect ratio of 1.43:1 than for a Scope image format with a projection
aspect ratio of 2.40:1. The projected values can be further combined as a
weighted average to produce the value at point D of line k 714 of the left abstract
image 712. A similar projection method can be applied to the right-eye image frame
to produce a right abstract image (not shown in Figure 7).
[0076] In another embodiment, a left or right image frame is divided into multiple
regions and each region is projected into a distinctive abstract image pair, as
depicted in Figure 8 for a left-eye image sequence. Vertical sampling projection
algorithms can be applied to each region of the left image sequence, and an abstract
image pair can be produced from each region, resulting in multiple abstract image
pairs that form an abstract image pair stack 812. The position of a sampling line for
each region can be selected based on the principles discussed previously. The
region that includes the subtitles may be assigned as a primary region 804 and can
produce a primary abstract image pair 816 (the right abstract image is not shown in
Figure 8). The other regions may be regarded as auxiliary regions 806, 808, and
each produces an auxiliary abstract image pair 818, 820 (the right abstract image is
not shown in Figure 8). As a result, the primary abstract image pair 816 can
describe depth changes in the vicinity of subtitles, while the other auxiliary abstract
image pairs 818, 820 can describe depth changes in designated regions. A similar
projection method can be applied to the right-eye image frame to produce multiple
right abstract images (not shown in Figure 8).
[0077] In another embodiment, an abstract image pair is projected from a
selected region of an image frame so that it may not have the full width of the image
frame. An example is depicted in Figure 9. Two selected regions of the kth image
frame can be identified for the left image sequence - one may be a primary region
906 that contains the subtitle region 904 and the second may be an auxiliary region
908 near the top of the images. The subtitle region 904 depicted has a width of
W1