Offload Reads And Writes

< Back

Offload Reads And Writes

Abstract: Aspects of the subject matter described herein relate to offload reads and writes. In aspects a requestor that seeks to transfer data sends a request for a representation of the data. In response the requestor receives one or more tokens that represent the data. The requestor may then provide one or more of these tokens to a component with a request to write data represented by the one or more tokens. In some exemplary applications the component may use the one or more tokens to identify the data and may then read the data or logically write the data without additional interaction with the requestor. Tokens may be invalidated by request or based on other factors.

Get Free WhatsApp Updates!
Notices, Deadlines & Correspondence

Patent Information

Application #

Filing Date

07 March 2013

Publication Number

40/2014

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Email

lsmds@lakshmisri.com

Parent Application

Applicants

MICROSOFT CORPORATION

One Microsoft Way Redmond Washington 98052 6399

Inventors

1. CHRISTIANSEN Neal R.

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

2. NAGAR Rajeev

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

3. GREEN Dustin L.

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

4. SADOVSKY Vladimir

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

5. SMITH Malcolm James

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

6. MEHRA Karan

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

Specification

OFFLOAD READS AND WRITES
BACKGROUND
[0001] One mechanism for transferring data is to read the data from a file of
a source location into main memory and write the data from the main memory to a
destination location. While in some environments, this may work acceptably for
relatively little data, as the data increases, the time it takes to read the data and
transfer the data to another location increases. In addition, if the data is accessed
over a network, the network may impose additional delays in transferring the data
from the source location to the destination location. Furthermore, security issues
combined with the complexity of storage arrangements may complicate data
transfer.
[0002] The subject matter claimed herein is not limited to embodiments that
solve any disadvantages or that operate only in environments such as those
described above. Rather, this background is only provided to illustrate one
exemplary technology area where some embodiments described herein may be
practiced.
SUMMARY
[0003] Briefly, aspects of the subject matter described herein relate to
offload reads and writes. In aspects, a requestor that seeks to transfer data sends a
request for a representation of the data. In response, the requestor receives one or
more tokens that represent the data. The requestor may then provide one or more
of these tokens to a component with a request to write data represented by the one
or more tokens. In some exemplary applications, the component may use the one
or more tokens to identify the data and may then read the data or logically write the
data without additional interaction with the requestor. Tokens may be invalidated
by request or based on other factors.
[0004] This Summary is provided to briefly identify some aspects of the
subject matter that is further described below in the Detailed Description. This
Summary is not intended to identify key or essential features of the claimed subject
matter, nor is it intended to be used to limit the scope of the claimed subject matter.
[0005] The phrase "subject matter described herein" refers to subject matter
described in the Detailed Description unless the context clearly indicates otherwise.
The term "aspects" is to be read as "at least one aspect." Identifying aspects of the
subject matter described in the Detailed Description is not intended to identify key
or essential features of the claimed subject matter.
[0006] The aspects described above and other aspects of the subject matter
described herein are illustrated by way of example and not limited in the
accompanying figures in which like reference numerals indicate similar elements
and in which:
BRIEF DESCRIPTION OF THE DRAWINGS
[0007] FIGURE 1 is a block diagram representing an exemplary generalpurpose
computing environment into which aspects of the subject matter described
herein may be incorporated;
[0008] FIGS. 2-5 are block diagrams that represent exemplary arrangements
of components of systems in which aspects of the subject matter described herein
may operate; and
[0009] FIGS. 6-8 are flow diagrams that generally represent exemplary
actions that may occur in accordance with aspects of the subject matter described
herein.
DETAILED DESCRIPTION
DEFINITIONS
[0010] As used herein, the term "includes" and its variants are to be read as
open-ended terms that mean "includes, but is not limited to." The term "or" is to be
read as "and/or" unless the context clearly dictates otherwise. The term "based on"
is to be read as "based at least in part on." The terms "one embodiment" and "an
embodiment" are to be read as "at least one embodiment." The term "another
embodiment" is to be read as "at least one other embodiment." Other definitions,
explicit and implicit, may be included below.
[0011] Sometimes herein the terms "first", "second", "third" and so forth are
used. The use of these terms, particularly in the claims, is not intended to imply an
ordering but is rather used for identification purposes. For example, the phrase
"first data" and "second data" does not necessarily mean that the first data is
located physically or logically before the second data or even that the first data is
requested or operated on before the second data. Rather, these phrases are used to
identify sets of data that are possibly distinct or non-distinct. That is, first data and
second data may refer to different data, the same data, some of the same data and
some different data, or the like. The first data may be a subset, potentially proper
subset, of the second data or vice versa.
[0012] Note, although the phrases "data of the store" and "data in the store"
are sometimes used herein, there is no intention in using these phrases to limit the
data mentioned to data that is physically stored on a store. Rather these phrases are
meant to limit the data to data that is logically in the store even if the data is not
physically in the store. For example, a storage abstraction (described below) may
perform an optimization wherein chunks of zeroes (or other data values) are not
actually stored on the underlying storage media but are rather represented by
shortened data (e.g., a value and length) that represents the zeros. Other examples
are provided below.
EXEMPLARY OPERATING ENVIRONMENT
[0013] Figure 1 illustrates an example of a suitable computing system
environment 100 on which aspects of the subject matter described herein may be
implemented. The computing system environment 100 is only one example of a
suitable computing environment and is not intended to suggest any limitation as to
the scope of use or functionality of aspects of the subject matter described herein.
Neither should the computing environment 100 be interpreted as having any
dependency or requirement relating to any one or combination of components
illustrated in the exemplary operating environment 100.
[0014] Aspects of the subject matter described herein are operational with
numerous other general purpose or special purpose computing system
environments or configurations. Examples of well-known computing systems,
environments, or configurations that may be suitable for use with aspects of the
subject matter described herein comprise personal computers, server computers,
hand-held or laptop devices, multiprocessor systems, microcontroller-based
systems, set-top boxes, programmable consumer electronics, network PCs,
minicomputers, mainframe computers, personal digital assistants (PDAs), gaming
devices, printers, appliances including set-top, media center, or other appliances,
automobile-embedded or attached computing devices, other mobile devices,
distributed computing environments that include any of the above systems or
devices, and the like.
[0015] Aspects of the subject matter described herein may be described in
the general context of computer-executable instructions, such as program modules,
being executed by a computer. Generally, program modules include routines,
programs, objects, components, data structures, and so forth, which perform
particular tasks or implement particular abstract data types. Aspects of the subject
matter described herein may also be practiced in distributed computing
environments where tasks are performed by remote processing devices that are
linked through a communications network. In a distributed computing
environment, program modules may be located in both local and remote computer
storage media including memory storage devices.
[0016] With reference to Figure 1, an exemplary system for implementing
aspects of the subject matter described herein includes a general-purpose
computing device in the form of a computer 110. A computer may include any
electronic device that is capable of executing an instruction. Components of the
computer 110 may include a processing unit 120, a system memory 130, and a
system bus 121 that couples various system components including the system
memory to the processing unit 120. The system bus 121 may be any of several
types of bus structures including a memory bus or memory controller, a peripheral
bus, and a local bus using any of a variety of bus architectures. By way of
example, and not limitation, such architectures include Industry Standard
Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA
(EISA) bus, Video Electronics Standards Association (VESA) local bus, Peripheral
Component Interconnect (PCI) bus also known as Mezzanine bus, Peripheral
Component Interconnect Extended (PCI-X) bus, Advanced Graphics Port (AGP),
and PCI express (PCIe).
[0017] The computer 110 typically includes a variety of computer-readable
media. Computer-readable media can be any available media that can be accessed
by the computer 110 and includes both volatile and nonvolatile media, and
removable and non-removable media. By way of example, and not limitation,
computer-readable media may comprise computer storage media and
communication media.
[0018] Computer storage media includes both volatile and nonvolatile,
removable and non-removable media implemented in any method or technology for
storage of information such as computer-readable instructions, data structures,
program modules, or other data. Computer storage media includes RAM, ROM,
EEPROM, flash memory or other memory technology, CD-ROM, digital versatile
discs (DVDs) or other optical disk storage, magnetic cassettes, magnetic tape,
magnetic disk storage or other magnetic storage devices, or any other medium
which can be used to store the desired information and which can be accessed by
the computer 110.
[0019] Communication media typically embodies computer-readable
instructions, data structures, program modules, or other data in a modulated data
signal such as a carrier wave or other transport mechanism and includes any
information delivery media. The term "modulated data signal" means a signal that
has one or more of its characteristics set or changed in such a manner as to encode
information in the signal. By way of example, and not limitation, communication
media includes wired media such as a wired network or direct-wired connection,
and wireless media such as acoustic, RF, infrared and other wireless media.
Combinations of any of the above should also be included within the scope of
computer-readable media.
[0020] The system memory 130 includes computer storage media in the form
of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and
random access memory (RAM) 132. A basic input/output system 133 (BIOS),
containing the basic routines that help to transfer information between elements
within computer 110, such as during start-up, is typically stored in ROM 131.
RAM 132 typically contains data and/or program modules that are immediately
accessible to and/or presently being operated on by processing unit 120. By way of
example, and not limitation, Figure 1 illustrates operating system 134, application
programs 135, other program modules 136, and program data 137.
[0021] The computer 110 may also include other removable/non-removable,
volatile/nonvolatile computer storage media. By way of example only, Figure 1
illustrates a hard disk drive 141 that reads from or writes to non-removable,
nonvolatile magnetic media, a magnetic disk drive 151 that reads from or writes to
a removable, nonvolatile magnetic disk 152, and an optical disc drive 155 that
reads from or writes to a removable, nonvolatile optical disc 156 such as a CD
ROM or other optical media. Other removable/non-removable, volatile/nonvolatile
computer storage media that can be used in the exemplary operating environment
include magnetic tape cassettes, flash memory cards, digital versatile discs, other
optical discs, digital video tape, solid state RAM, solid state ROM, and the like.
The hard disk drive 141 may be connected to the system bus 121 through the
interface 140, and magnetic disk drive 151 and optical disc drive 155 may be
connected to the system bus 121 by an interface for removable non-volatile
memory such as the interface 150.
[0022] The drives and their associated computer storage media, discussed
above and illustrated in Figure 1, provide storage of computer-readable
instructions, data structures, program modules, and other data for the computer 110.
In Figure 1, for example, hard disk drive 141 is illustrated as storing operating
system 144, application programs 145, other program modules 146, and program
data 147. Note that these components can either be the same as or different from
operating system 134, application programs 135, other program modules 136, and
program data 137. Operating system 144, application programs 145, other program
modules 146, and program data 147 are given different numbers from their
corresponding counterparts in the RAM 132 to illustrate that, at a minimum, they
are different copies.
[0023] A user may enter commands and information into the computer 110
through input devices such as a keyboard 162 and pointing device 161, commonly
referred to as a mouse, trackball, or touch pad. Other input devices (not shown)
may include a microphone, joystick, game pad, satellite dish, scanner, a touchsensitive
screen, a writing tablet, or the like. These and other input devices are
often connected to the processing unit 120 through a user input interface 160 that is
coupled to the system bus, but may be connected by other interface and bus
structures, such as a parallel port, game port or a universal serial bus (USB).
[0024] A monitor 191 or other type of display device is also connected to the
system bus 121 via an interface, such as a video interface 190. In addition to the
monitor, computers may also include other peripheral output devices such as
speakers 197 and printer 196, which may be connected through an output
peripheral interface 195.
[0025] The computer 110 may operate in a networked environment using
logical connections to one or more remote computers, such as a remote computer
180. The remote computer 180 may be a personal computer, a server, a router, a
network PC, a peer device or other common network node, and typically includes
many or all of the elements described above relative to the computer 110, although
only a memory storage device 181 has been illustrated in Figure 1. The logical
connections depicted in Figure 1 include a local area network (LAN) 171 and a
wide area network (WAN) 173, but may also include other networks. Such
networking environments are commonplace in offices, enterprise-wide computer
networks, intranets, and the Internet.
[0026] When used in a LAN networking environment, the computer 110 is
connected to the LAN 171 through a network interface or adapter 170. When used
in a WAN networking environment, the computer 110 may include a modem 172
or other means for establishing communications over the WAN 173, such as the
Internet. The modem 172, which may be internal or external, may be connected to
the system bus 121 via the user input interface 160 or other appropriate mechanism.
In a networked environment, program modules depicted relative to the computer
110, or portions thereof, may be stored in the remote memory storage device. By
way of example, and not limitation, Figure 1 illustrates remote application
programs 185 as residing on memory device 181. It will be appreciated that the
network connections shown are exemplary and other means of establishing a
communications link between the computers may be used.
Offload Reads and Writes
[0027] As mentioned previously, some traditional data transfer operations
may not be efficient or even work in today's storage environments.
[0028] FIGS. 2-5 are block diagrams that represent exemplary arrangements
of components of systems in which aspects of the subject matter described herein
may operate. The components illustrated in FIGS. 2-5 are exemplary and are not
meant to be all-inclusive of components that may be needed or included. In other
embodiments, the components and/or functions described in conjunction with
FIGS. 2-5 may be included in other components (shown or not shown) or placed in
subcomponents without departing from the spirit or scope of aspects of the subject
matter described herein. In some embodiments, the components and/or functions
described in conjunction with FIGS. 2-5 may be distributed across multiple
devices.
[0029] Turning to FIG. 2, the system 205 may include a requestor 210, data
access components 215, a token manager 225, a store 220, and other components
(not shown). The system 205 may be implemented via one or more computing
devices. Such devices may include, for example, personal computers, server
computers, hand-held or laptop devices, multiprocessor systems, microcontrollerbased
systems, set-top boxes, programmable consumer electronics, network PCs,
minicomputers, mainframe computers, cell phones, personal digital assistants
(PDAs), gaming devices, printers, appliances including set-top, media center, or
other appliances, automobile-embedded or attached computing devices, other
mobile devices, distributed computing environments that include any of the above
systems or devices, and the like.
[0030] Where the system 205 comprises a single device, an exemplary
device that may be configured to act as the system 205 comprises the computer 110
of FIG. 1. Where the system 205 comprises multiple devices, one or more of the
multiple devices may comprise a similarly or differently configured computer 110
of FIG. 1.
[0031] The data access components 215 may be used to transmit data to and
from the store 220. The data access components 215 may include, for example,
one or more of: I/O managers, filters, drivers, file server components, components
on a storage area network (SAN) or other storage device, and other components
(not shown). A SAN may be implemented, for example, as a device that exposes
logical storage targets, as a communication network that includes such devices, or
the like.
[0032] In one embodiment, a data access component may comprise any
component that is given an opportunity to examine I/O between the requestor 210
and the store 220 and that is capable of changing, completing, or failing the I/O or
performing other or no actions based thereon. For example, where the system 205
resides on a single device, the data access components 215 may include any object
in an I/O stack between the requestor 210 and the store 220. Where the system 205
is implemented by multiple devices, the data access components 215 may include
components on a device that hosts the requestor 210, components on a device that
provides access to the store 220, and/or components on other devices and the like.
In another embodiment, the data access components 215 may include any
components (e.g., such as a service, database, or the like) used by a component
through which the I/O passes even if the data does not flow through the used
components.
[0033] As used herein, the term component is to be read to include all or a
portion of a device, a collection of one or more software modules or portions
thereof, some combination of one or more software modules or portions thereof and
one or more devices or portions thereof, and the like.
[0034] In one embodiment, the store 220 is any storage media capable of
storing data. The store 220 may include volatile memory (e.g., a cache) and non
volatile memory (e.g., a persistent storage). The term data is to be read broadly to
include anything that may be represented by one or more computer storage
elements. Logically, data may be represented as a series of 1' s and 0's in volatile
or non-volatile memory. In computers that have a non-binary storage medium, data
may be represented according to the capabilities of the storage medium. Data may
be organized into different types of data structures including simple data types such
as numbers, letters, and the like, hierarchical, linked, or other related data types,
data structures that include multiple other data structures or simple data types, and
the like. Some examples of data include information, program code, program state,
program data, commands, other data, or the like.
[0035] The store 220 may comprise hard disk storage, solid state, or other
non-volatile storage, volatile memory such as RAM, other storage, some
combination of the above, and the like and may be distributed across multiple
devices (e.g., multiple SANs, multiple file servers, a combination of heterogeneous
devices, and the like). The devices used to implement the store 220 may be located
physically together (e.g., on a single device, at a datacenter, or the like) or
distributed geographically. The store 220 may be arranged in a tiered storage
arrangement or a non-tiered storage arrangement. The store 220 may be external,
internal, or include components that are both internal and external to one or more
devices that implement the system 205. The store 220 may be formatted (e.g., with
a file system) or non-formatted (e.g., raw).
[0036] In another embodiment, the store 220 may be implemented as a
storage abstraction rather than as direct physical storage. A storage abstraction
may include, for example, a file, volume, disk, virtual disk, logical unit, data
stream, alternate data stream, metadata stream, or the like. For example, the store
220 may be implemented by a server having multiple physical storage devices. In
this example, the server may present an interface that allows a data access
component to access data of a store that is implemented using one or more of the
physical storage devices or portions thereof of the server.
[0037] This level of abstraction may be repeated to any arbitrary depth. For
example, the server providing a storage abstraction to the data access components
215 may also rely on a storage abstraction to access and store data.
[0038] In another embodiment, the store 220 may include a component that
provides a view into data that may be persisted or non-persisted in non-volatile
storage.
[0039] One or more of the data access components 215 may reside on an
apparatus that hosts the requestor 210 while one or more other of the data access
components 215 may reside on an apparatus that hosts or provides access to the
store 220. For example, if the requestor 210 is an application that executes on a
personal computer, one or more of the data access components 215 may reside in
an operating system hosted on the personal computer. As another example, if the
store 220 is implemented by a storage area network (SAN), one or more of the data
access components 215 may implement a storage operating system that manages
and/or provides access to the store 220. When the requestor 210 and the store 220
are hosted in a single apparatus, all or many of the data access components 215
may also reside on the apparatus.
[0040] To initiate an offload read (described below) of data of the store 220,
the requestor 210 may send a request to obtain a token representing the data using a
predefined command (e.g., via an API). In response, one or more of the data access
components 215 may respond to the requestor 210 by providing one or more tokens
that represents the data or a subset thereof.
[0041] For example, for various reasons it may be desirable to return a token
that represents less data than the originally requested data. When a token is
returned, it may be returned with a length or even multiple ranges of data that the
token represents. The length may be smaller than the length of data originally
requested.
[0042] One or more of the data access components 215 may operate on less
than the requested length associated with a token on either an offload read or
offload write. The length of data actually operated on is sometimes referred to
herein as the "effective length." Operating on less than the requested length may
be desirable for various reasons. The effective length may be returned so that the
requestor or other data access components are aware of how many bytes were
actually operated on by the command.
[0043] The data access components 215 may act in various ways in response
to an offload read or write including, for example:
[0044] 1. A partitioning data access component may adjust the offset of
the offload read or write request before forwarding the request to the next lower
data access component.
[0045] 2. A RAID data access component may split the offload read or
write request and forward the pieces to the same or different data access
components. In the case of RAID-0, a received request may be split along the
stripe boundary (resulting in a shorter effective length) whereas in the case of
RAID-1, the entire request may be forwarded to more than one data access
components (resulting in multiple tokens for the same data).
[0046] 3. A caching data access component may write out parts of its
cache that include the data that is about to be obtained by the offload read request.
[0047] 4. A caching data access component may invalidate those parts of
its cache that include the data that is about to be overwritten by an offload write
request.
[0048] 5. A data verification data access component may invalidate any
cached checksums of the data that are about to be overwritten by the offload write
request.
[0049] 6. An encryption data access component may fail an offload read
or write request.
[0050] 7. A snapshot data access component may copy the data in the
location that is about to overwritten by the offload write request. This may be
done, in part, so that the user can later 'go back' to a 'previous version' of that file
if necessary. The snapshot data access component may itself use offload read and
write commands to copy the data in the location (that is about to be overwritten) to
a backup location. In this example, the snapshot data access component may be
considered a "downstream requestor" (described below).
[0051] The examples above are not intended to be all-inclusive or
exhaustive. Based on the teachings herein, those skilled in the art may recognize
other scenarios in which the teachings herein may be applied without departing
from the spirit or scope of aspects of the subject matter described herein.
[0052] If a data access component 215 fails an offload read or write, an error
code may be returned that allows another data access component or the requestor to
attempt another mechanism for reading or writing the data. Capability discovery
may be performed during initialization, for example. When a store or even lower
layer data access components do not support a particular operation, other actions
may be performed by an upper data access component or a requestor to achieve the
same result. For example, if a storage system (described below) does not support
offload reads and writes, a data access component may manage tokens and
maintain a view of the data such that upper data access components are unaware
that the store or lower data access component does not provide this capability.
[0053] A requestor may include an originating requestor or a downstream
requestor. For example, a requestor may include an application that requests a
token so that the application can perform an offload write. This type of requestor
may be referred to as an originating requestor. As another example, a requestor
may include a server application (e.g., such as a Server Message Block (SMB)
server) that has received a copy command from a client. The client may have
requested that data be copied from a source store to a destination store via a copy
command. The SMB server may receive this request and in turn use offload reads
and writes to perform the copy. In this case, the requestor may be referred to as a
downstream requestor.
[0054] As used herein, unless specified otherwise or clear from the context,
the term requestor is to be read to include both an originating requestor and a
downstream requestor. An originating requestor is a requestor that originally sent a
request for an offload read or write. In other words, the term requestor is intended
to cover cases in which there are additional components above the requestor to
which the requestor is responding to initiate an offload read as well as cases in
which the requestor is originating the offload read or write on its own initiative.
[0055] For example, an originating requestor may be an application that
desires to transfer data from a source to a destination. This type of originating
requestor may send one or more offload read and write requests to the data access
components 215 to transfer the data.
[0056] A downstream requestor is a requestor that issues one or more offload
reads or writes to satisfy a request from another requestor. For example, one or
more of the data access components 215 may act as a downstream requestor and
may initiate one or more offload reads or writes to fulfill requests made from
another requestor. Some examples of downstream requestors have been given
above in reference to RAID-0, partitioning, and snapshot data access components
although these examples are not intended to be all-inclusive or exhaustive.
[0057] In one embodiment, a token comprises a random or pseudo random
number that is difficult to guess. The difficulty of guessing the number may be
selected by the size of the number as well as the mechanism used to generate the
number. The number represents data on the store 220 but may be much smaller
than the data. For example, a requestor may request a token for a 100 Gigabyte
file. In response, the requestor may receive, for example, a 512 byte or other sized
token.
[0058] As long as the token is valid, the token represents the data. In some
implementations, the token may represent the data as it logically existed when the
token was bound to the data. The term logically is used as the data may not all
reside in the store or even be persisted. For example, some of the data may be in a
cache that needs to be flushed before the token can be provided. As another
example, some of the data may be derived from other data. As another example,
data from disparate sources may need to be combined or otherwise manipulated to
create the data represented by the token. The binding may occur after a request for
a token is received and before or at the time the token is returned.
[0059] In other implementations, the data represented by the token may
change. The behavior of whether the data may change during the validity of the
token may be negotiated with the requestor or between components. This is
described in more detail below.
[0060] A token may expire and thus become invalidated or may be explicitly
invalidated before expiring. For example, if a file represented by the token is
closed, the computer hosting the requestor 210 is shut down, a volume having data
represented by the token is dismounted, the intended usage of the token is
complete, or the like, a message may be sent to explicitly invalidate the token.
[0061] In some implementations, the message to invalidate the token may be
treated as mandatory and followed. In other implementations, the message to
invalidate the token may be treated as a hint which may or may not be followed.
After the token is invalidated, it may no longer be used to access data.
[0062] A token may be protected by the same security mechanisms that
protect the data the token represents. For example, if a user has rights to open and
read a file, this may allow the user to obtain a token that allows the user to copy the
file elsewhere. If a channel is secured for reading the file, the token may be passed
via a secured channel. If the data may be provided to another entity, the token may
be passed to the other entity just as the data could be. The receiving entity may use
the token to obtain the data just as the receiving entity could have used the data
itself were the data itself sent to the receiving entity.
[0063] The token may be immutable. That is, if the token is changed in any
way, it may no longer be usable to access the data the token represented.
[0064] In one embodiment, only one token is provided that represents the
data. In another embodiment, however, multiple tokens may be provided that each
represents portions of the data. In yet another embodiment, portions or all of the
data may be represented by multiple tokens. These tokens may be encapsulated in
another data structure or provided separately.
[0065] In the encapsulated case, a non-advanced requestor may simply pass
the data structure back to a data access component when the requestor seeks to
perform an operation (e.g., offload write, token invalidation) on the data. A more
advanced requestor 210 may be able to re-arrange tokens in the encapsulated data
structure, use individual tokens separately from other tokens to perform data
operations, or take other actions when multiple tokens are passed back.
[0066] After receiving a token, the requestor 210 may request that all or
portions of the data represented by the token be logically written. Sometimes
herein this operation is called an offload write. The requestor 210 may do this by
sending the token together with one or more offsets and lengths to the data access
components 215.
[0067] For an offload write, for each token involved, a token-relative offset
may be indicated as well as a destination-relative offset. Either or both offsets may
be implicit or explicit. A token-relative offset may represent a number of bytes (or
other units) from the beginning of data represented by the token, for example. A
destination-relative offset may represent the number of bytes (or other units) from
the beginning of data on the destination. A length may indicate a number of bytes
(or other units) to copy starting at the offset.
[0068] One or more of the data access components 215 may receive the
token, verify that the token represents data on the store, and if so logically write the
portions of data represented by the token according to the capabilities of a storage
system that hosts the underlying store 220. The storage system that hosts the
underlying store 220 may include one or more SANs, dedicated file servers, general
servers or other computers, network appliances, any other devices suitable for
implementing the computer 110 of FIG. 1, and the like.
[0069] For example, if the store 220 is hosted via a storage system such as a
SAN and the requestor 210 is requesting an offload write to the SAN using a token
that represents data that exists on the SAN, the SAN may utilize a proprietary
mechanism of the SAN to logically write the data without making another physical
copy of the data. For example, reference counting or another mechanism may be
used to indicate the number of logical copies of the data. For example, reference
counts may be used at the block level where a block may be logically duplicated on
the SAN by increasing a reference count of the block.
[0070] As another example, the store 220 may be hosted via a storage system
such as a file server that may have other mechanisms useful in performing an
offload write such that the offload write does not involve physically copying the
data.
[0071] As yet another example, the store 220 may be hosted via a "dumb"
storage system that physically copies the data from one location to another location
of the storage system in response to an offload write.
[0072] The examples above are not intended to be all-inclusive or
exhaustive. Indeed, from the point of view of a requestor, it may be irrelevant how
the storage system implements a data transfer corresponding to the offload write.
[0073] As noted previously, the data transfer operation of the storage system
may be time delayed. In some scenarios the data transfer operation may not occur
at all. For example, the storage system may quickly respond that an offload write
has completed but may receive a command to trim the underlying store before the
storage system has actually started the data transfer. In this case, the data transfer
operation at the storage system may be cancelled.
[0074] The requestor 210 may share the token with one or more other
entities. For example, the requestor may send the token to an application hosted on
an apparatus external to the apparatus upon which the requestor 210 is hosted. This
application may then use the token to write data in the same manner that the
requestor 210 could have. This scenario is illustrated in FIG. 5.
[0075] Turning to FIG. 5, using the data access components 215, the
requestor 210 requests and obtains a token representing data on the store 220. The
requestor 210 then passes this token to the requestor 510. The requestor 510 may
then write the data by sending the token via the data access components 515.
[0076] One or more of the data access components 215 and 515 may be the
same. For example, if the requestors 210 and 510 are hosted on the same
apparatus, all of the data access components 215 and 515 may be the same for both
requestors. If the requestors 210 and 510 are hosted on different apparatuses, some
components may be the same (e.g., components that implement an apparatus
hosting or providing access to the store 220) while other components may be
different (e.g., components on the different apparatuses).
[0077] Returning to FIG. 2, in one embodiment, one or more of the data
access components 215 may include or consult with a token manager (e.g., such as
the token manager 225). A token manager may include one or more components
that may generate or obtain tokens that represent the data on the store 220, provide
these tokens to an authorized requestor, respond to requests to write data using the
tokens, and determine when to invalidate a token. As described in more detail
below, a token manager may be distributed across multiple devices such that
logically the same token manager is used both to obtain a token in an offload read
and use the token in an offload write. In this case, distributed components of the
token manager may communicate with each other to obtain information about
tokens as needed. In one embodiment, a token manager may generate tokens, store
the tokens in a token store that associates the tokens with data on the store 220, and
verify that tokens received from requestors are found in the token store.
[0078] The token manager 225 may associate tokens with data that identifies
where the data may be found. This data may also be used where the token manager
225 is distributed among multiple devices to obtain token information (what data
the token represents, if the token has expired, other data, and the like) from
distributed components of the token manager 225. The token manager 225 may
also associate a token with a length of the data to ensure, in part, that a requestor is
not able to obtain data past the end of the data associated with a token.
[0079] If data on the store 220 is changed or deleted, the token manager 225
may take various actions, depending on how the token manager 225 is configured.
For example, if configured to preserve the data represented by a token, the token
manager 225 may ensure that a copy of the data that existed at the time the token
was generated is maintained. Some storage systems may have sophisticated
mechanisms for maintaining such copies even when the data has changed. In this
case, the token manager 225 may instruct the storage system (of which the store
220 may be part) to maintain a copy of the original data for a period of time or until
instructed otherwise.
[0080] In other cases, a storage system may not implement a mechanism for
maintaining a copy of the original data. In this case, the token manager 225 or
another of the data access components 215 may maintain a copy of the original data
for a period of time or until instructed otherwise.
[0081] Note that maintaining a copy of the original data may involve
maintaining a logical copy rather than a duplicate copy of the original data. A
logical copy includes data that may be used to create the exact copy. For example,
a logical copy may include a change log together with the current state of the data.
By applying the change log in reverse to the current state, the original copy may be
obtained. As another example, copy-on-write techniques may be used to maintain a
logical copy that can be used to reconstruct the original data. The examples above
are not intended to be limiting as it will be understood by those skilled in the art
that there are many ways in which a logical copy could be implemented without
departing from the spirit or scope of aspects of the subject matter described herein.
[0082] The token manager 225 may be configured to invalidate the token
when the data changes. In this case, in conjunction with allowing data associated
with the token to change, the token manager 225 may indicate that the token is no
longer valid. This may be done, for example, by deleting or marking the token as
invalid in the token store. If the token manager 225 is implemented by a
component of the storage system, one or more failure codes may be passed to one
or more other data access components and passed to the requestor 210.
[0083] The token manager 225 may manage expiration of the token. For
example, a token may have a time to live. After the time to live has expired, the
token may be invalidated. In another embodiment, the token may remain valid
depending on various factors including:
[0084] 1. Storage constraints. Maintaining original copies of the data
may consume space over a threshold. At that point, one or more tokens may be
invalidated to reclaim the space.
[0085] 2. Memory constraints. The memory consumed by maintaining
multiple tokens may exceed a threshold. At that point, one or more tokens may be
invalidated to reclaim memory space.
[0086] 3. Number of tokens. A system may allow a set number of active
tokens. After the maximum number of tokens is reached, the token manager may
invalidate an existing token prior to providing another token.
[0087] 4. Input/Output (IO) overhead. The IO overhead of having too
many tokens may be such that a token manager may invalidate one or more tokens
to reduce IO overhead.
[0088] 5. IO Cost/Latency. A token may be invalidated based on cost
and/or latency of a data transfer from source to destination. For example, if the
cost exceeds a threshold the token may be invalidated. Likely, if the latency
exceeds a threshold, the token may be invalided.
[0089] 6. Priority. Certain tokens may have priority over other tokens.
If a token is to be invalidated, a lower priority token may be invalidated. The
priority of tokens may be adjusted based on various policies (e.g., usage, explicit or
implicit knowledge about token, request by requestor, other policies, or the like).
[0090] 7. Storage provider request. A storage provider (e.g., SAN) may
request a reduction in number of active tokens. In response, the token manager
may invalidate one or more tokens as appropriate.
[0091] A token may be invalidated at any time before or even after one or
more offload writes based on the token have succeeded.
[0092] In one embodiment, a token includes only a value that represents the
data. In another embodiment, a token may also include or be associated with other
data. This other data may include, for example, data that can be used to determine
a storage device, storage system, or other entity from which the data may be
obtained, identification information of a network storage system, routing data and
hints, information regarding access control mechanisms, checksums regarding the
data represented by the token, type of data (e.g., system, metadata, database, virtual
hard drive, and the like), access patterns of the data (e.g., sequential, random),
usage patterns (e.g., often, sometimes, rarely accessed and the like), desired
alignment of the data, data for optimizing placement of the data during offload
write (e.g., in hybrid environments with different types of storage devices), and the
like.
[0093] The above examples are not intended to be all-inclusive or exhaustive
of the other data that may be included in or associated with a token. Indeed based
on the teachings herein, those skilled in the art may recognize other data that may
be conveyed with the token without departing from the spirit or scope of aspects of
the subject matter described herein.
[0094] A read/write request to a store may internally result in splitting of
read requests to lower layers of the storage stack as file fragment boundaries, RAID
stripe boundaries, volume spanning boundaries, and the like are encountered. This
splitting may occur because the source/destination differs across the split, or the
offset translation differs across the split. This splitting may be hidden by the
splitter by not completing a request that needs to be split until the resulting split IOs
are all completed.
[0095] This hiding of the splitting to within the splitting layer in the storage
stack is convenient in that the layers above in the storage stack do not need to know
about the splitting. With the token-based approach described herein, in one
embodiment, splitting may be visible. In particular, if splitting occurs due to
source/destination differing across the split, then the offload providers (described
below) may differ across the split. For example, where data is duplicated (or even
not duplicated), there may be multiple offload providers that provide access to the
data. As another example, there may be multiple file servers that front a SAN. In
addition to the SAN, one or more of the servers or other data access components
may be considered an offload provider.
[0096] An offload provider is a logical entity (possibly including multiple
components spread across multiple devices) that provides access to data associated
with a store—source or destination. Access as used herein may include reading
data, writing data, deleting data, updating data, a combination including two or
more of the above, and the like. Logically, an offload provider is capable of
performing an offload read or write. Physically, an offload provider may include
one or more of the data access components 215 and may also include the token
manager 225.
[0097] An offload provider may transfer data from a source store, write data
to a destination store, and maintain data to be provided upon receipt of a token
associated with the data. In some implementations, an offload provider may
indicate that an offload write command is completed after the data has been
logically written to the destination store. In addition, an offload provider may
indicate that an offload write command is completed but defer physically writing
data associated with the offload write until convenient.
[0098] When data is split, an offload provider may provide access to a
portion of the requested data, but not provide access to another portion of the
requested data. In this case, separate tokens may be provided for the portion before
the split point and the portion after the split point. Other implementationdependent
constraints in layers of the storage stack or in offload providers may
result in inability of a token to span across split ranges for other reasons. Because
the requestor may see the token(s) returned from a read, in this embodiment,
splitting may be visible to the requestor.
[0099] Following are two exemplary approaches to dealing with splitting:
[00100] 1. A read request may return more than one token where each
token is associated with a different range of the data requested. These multiple
tokens may be returned in a single data structure as mentioned previously. When
the requestor seeks to write data, it may pass the data structure as a whole or, if
acting in an advanced way, just one or more tokens in the data structure.
[00101] 2. If a single token is returned, the token may represent a
shortened range of the data originally requested. The requestor may then use the
token to perform one or more offload writes within the length limits of the
shortened range. When an offload write is requested, the length of the requested
write may also be truncated. For both reads and writes, a requestor may make a
request for another range starting at an offset not handled by a previous request. In
this manner, the requestor may work through the requestor's overall needed range.
[00102] The above approaches are exemplary only. Based on the teachings
herein, those skilled in the art may recognize other approaches for dealing with
splitting that may be utilized without departing from the spirit or scope of aspects
of the subject matter described herein.
[00103] There may be multiple offload providers in the same stack. For a
given range returned from an offload read request (possibly the only range, in the
case of range truncation), there may be multiple offload providers willing to
provide a token. In one embodiment, these multiple tokens for the same data may
be returned to a requestor and used by the requestor in an offload write.
[00104] For example, the requestor may select one of the tokens for use in an
offload write. By passing only one token to an offload provider the requestor may,
in this manner, determine the source offload provider that is used to obtain the data
from. In another example, the requestor may pass two or more of the tokens to a
destination offload provider. The destination offload provider may then select one
or more of the source offload providers associated with the tokens from which to
obtain the data represented by the tokens.
[00105] In another example, multiple tokens may be returned to enable both
offloaded copy of bulk data, and offloaded copying of other auxiliary data in
addition to bulk data. One example of auxiliary data is metadata regarding the data.
For example, a file system offload provider may specify that an offload write
request include two tokens (e.g., a primary data token and a metadata token) to
successfully be used on the destination stack in order for the overall offload copy to
succeed.
[00106] In contrast, multiple tokens used for the purpose of supporting
multiple bulk data offload providers in the stack may require that only one token be
used on the destination stack in order to for an offload write to succeed.
[00107] When multiple offload providers are available to transfer data from
the source to destination, the requestor may be able to select one or more specific
offload providers of the available ones. In one embodiment, this may involve using
a skip N command where "skip N" indicates skip the first N offload providers. In
another embodiment, there may be another mechanism used (e.g., an ID of the
offload provider) to identify the specific offload provider(s). In yet another
embodiment, selecting one of many tokens may be used to select the offload
provider(s) to copy the data as some offload providers may not be able to copy data
represented by the token while others may be able to do so.
[00108] In some embodiments, where more than one offload provider is
available to copy data represented by a token, the first, last, random, least loaded,
most efficient, lowest latency, or otherwise determined offload provider may be
automatically selected.
[00109] A token may represent data that begins at a certain sector of a hard
disk or other storage medium. The data the token represents may be an exact
multiple of sectors but in many cases will not be. If the token is used in a file
operation for data past the end of its length, the data returned may be null, 0, or
some other indication of no data. Thus, if a requestor attempts to copy past the end
of the data represented by the token, the requestor may not through this mechanism
obtain data that physically resides just past the end of the data.
[00110] A token may be used to offload the zeroing of a large file. For
example, a token may represent null, 0, or another "no data" file. By using this
token in an offload write, the token may be used to initialize a file or other data.
[00111] FIG. 3 is a block diagram that generally represents an exemplary
arrangement of components of systems in which a token manager is hosted by the
device that hosts the store. As illustrated the system 305 includes the requestor 210
and the store 220 of FIG. 2. The data access components 215 of FIG. 3 are divided
between the data access components 310 that reside on the device 330 that hosts the
requestor 210 and the data access components 315 that reside on the device 335
that hosts the store 220. In another embodiment, where the store 220 is external to
the device 335, there may be additional data access components that provide access
to the store 220.
[00112] The device 335 may be considered to be an offload provider as this
device includes the needed components for providing a token and writing data
given the token.
[00113] The token manager 320 may generate and validate tokens as
previously described. For example, when the requestor 210 asks for a token for
data on the store 220, the token manager 320 may generate a token that represents
the data. This token may then be sent back to the requestor 210 via the data access
components 310 and 315.
[00114] In conjunction with generating a token, the token manager 320 may
create an entry in the token store 325. This entry may associate the token with data
that indicates where on the store 220 the data represented by the token may be
found. The entry may also include other data used in managing the token such as
when to invalidate the token, a time to live for the token, other data, and the like.
[00115] When the requestor 2 10 or any other entity provides the token to the
token manager 320, the token manager may perform a lookup in the token store
325 to determine whether the token exists. If the token exists and is valid, the
token manager 320 may provide location information to the data access
components 315 so that these components may logically write the data as
requested.
[00116] Where multiple physical devices provide access to the store 220, the
token manager 320 and/or the token store 325 may have components that are
hosted by one or more of the physical devices. For example, the token manager
320 may replicate token state across devices, may have a centralized token
component that other token components consult, may have a distributed system in
which token state is provided from peer token managers on an as-needed basis, or
the like.
[00117] Logically, the token manager 320 manages tokens. Physically, the
token manager 320 may be hosted by a single device or may have components
distributed over two or more devices. The token manager 320 may be hosted on a
device that is separate from any devices that host the store 220. For example, the
token manager 320 may exist as a service that data access components 315 may call
to generate and validate tokens and provide location information associated
therewith.
[00118] In one embodiment, the token store 325 may be stored on the store
220. In another embodiment, the token store 325 may be separate from the store
220.
[00119] FIG. 4 is a block diagram that generally represents another exemplary
arrangement of components of systems that operates in accordance with aspects of
the subject matter described herein. As illustrated, the apparatus 405 hosts the
requestor 210 as well as data access components 310 and a virtualization layer 430.
The data access components 310 are arranged in a stacked manner and include N
components that include components 415, 420, 425, and other components (not
shown). The number N is variable and may vary from apparatus to apparatus.
[00120] The requestor 210 accesses one or more of the data access
components 310 via the application programming interface (API) 410. The
virtualization layer 430 indicates that the requestor or any of the data access
components may reside in a virtual environment.
[00121] A virtual environment is an environment that is simulated or
emulated by a computer. The virtual environment may simulate or emulate a
physical machine, operating system, set of one or more interfaces, portions of the
above, combinations of the above, or the like. When a machine is simulated or
emulated, the machine is sometimes called a virtual machine. A virtual machine is
a machine that, to software executing on the virtual machine, appears to be a
physical machine. The software may save files in a virtual storage device such as
virtual hard drive, virtual floppy disk, and the like, may read files from a virtual
CD, may communicate via a virtual network adapter, and so forth.
[00122] Files in a virtual hard drive, floppy, CD, or other virtual storage
device may be backed with physical media that may be local or remote to the
apparatus 405. The virtualization layer 430 may arrange data on the physical media
and provide the data to the virtual environment in a manner such that one or more
components accessing the data are unaware that they are accessing the data in a
virtual environment.
[00123] More than one virtual environment may be hosted on a single
computer. That is, two or more virtual environments may execute on a single
physical computer. To software executing in each virtual environment, the virtual
environment appears to have its own resources (e.g., hardware) even though the
virtual environments hosted on a single computer may physically share one or more
physical devices with each other and with the hosting operating system.
[00124] The source store 435 represents the store from which the requestor
210 is requesting a token. The destination store 440 represents the store to which
the requestor requests that data be written using the token. In implementation, the
source store 435 and the destination store 440 may be implemented as a single store
(e.g., a SAN with multiple volumes) or two or more stores. Where the source store
435 does not support maintaining a copy of the original data, one or more of the
components 415-425 may operate to maintain a copy of the original data during the
lifetime of the token.
[00125] When the source store 435 and the destination store 440 are
implemented as two separate stores, additional components (e.g., storage server or
other components) may transfer the data from the source store 435 to the
destination store 440 without involving the apparatus 405. In one embodiment,
however, even when the source store 435 and the destination store 440 are
implemented as two separate stores, one or more of the data access components 310
may act to copy data from the source store 435 to the destination store 440. The
requestor 210 may be aware or unaware, informed or non-informed, of how the
underlying copying is performed.
[00126] There may be multiple paths between the requestor 210 and the
source store 435 and/or the destination store 440. In one embodiment, the token
methodology described herein is independent of the path taken provided that
information indicating the data represented (e.g., available via the token manager)
is available. In other words, if the requestor 210 has a path that passes through the
virtualization layer 430, a network path that does not pass through the virtualization
layer 430, an SMB path, or any other path to the source or destination stores, the
requestor 210 may use one or more of these paths to issue an offload write to the
destination store 440. In other words, the path taken to the source store and the
path taken to the destination store may be the same or different.
[00127] In the offload write, the token is passed together with one or more
offsets and lengths of data to write to the destination store 440. A data access
component (not necessarily one of the data access components 310) receives the
token, uses the token to obtain location information from a token manager, and may
commence logically writing the data from the source store 435 to the destination
store 440.
[00128] One or more of the components 415-425 or another component (not
shown) may implement a token manager.
[00129] Following are some exemplary definitions of some data structures
that may be used with aspects of the subject matter described herein:
#define FSCTL OFFLOAD READ
CTL_CODE(FILE_DEVICE_FILE_SYSTEM, 153, METHOD BUFFERED,
FILE READ ACCESS) III 53 is used to indicate offload read
typedef struct FSCTL OFFLOAD READ INPUT {
ULONG Size;
ULONG Flags;
ULONG TokenTimeToLive; // (e.g., in milliseconds)
ULONG Reserved;
ULONGLONG FileOffset;
ULONGLONG CopyLength;
} FSCTL OFFLOAD READ INPUT, *PFSCTL_OFFLOAD_READ INPUT;
typedef struct FSCTL OFFLOAD READ OUTPUT {
ULONG Size;
ULONG Flags;
ULONGLONG TransferLength;
UCHAR Token[512]; // May be larger or smaller than 2
} FSCTL OFFLOAD READ OUTPUT,
*PFSCTL_OFFLOAD_READ_OUTPUT;
#define FSCTL OFFLOAD WRITE
CTL_CODE(FILE_DEVICE_FILE_SYSTEM, 154, METHOD BUFFERED,
FILE WRITE ACCESS) // 154 is used to indicate offload write
typedef struct FSCTL OFFLOAD WRITE INPUT {
ULONG Size;
ULONG Flags;
ULONGLONG FileOffset;
ULONGLONG CopyLength;
ULONGLONG TransferOffset;
UCHAR Token[5 12];
} FSCTL OFFLOAD WRITE INPUT, *PFSCTL_OFFLOAD_WPJTE INPUT;
typedef struct FSCTL OFFLOAD WRITE OUTPUT {
ULONG Size;
ULONG Flags;
ULONGLONG LengthWritten;
} FSCTL OFFLOAD WRITE OUTPUT,
*PFSCTL_OFFLOAD_WRITE_OUTPUT;
//
// This flag, when OR'd into an action indicates that the given action is
// non-destructive. If this flag is set then storage stack components which
// do not understand the action should forward the given request
//
#define DeviceDsmActionFlag NonDestructive 0x80000000
#define IsDsmActionNonDestructive( Action) ((BOOLEAN)((_Action &
DeviceDsmActionFlag NonDestructive) != 0))
typedef ULONG DEVICE DATA MANAGEMENT SET ACTION;
#define DeviceDsmAction_OffloadRead (3 |
DeviceDsmActionFlag NonDestructive)
#define DeviceDsmAction OffloadWrite 4
//
// Flags that are global across all actions
//
typedef struct DEVICE DATA SET RANGE {
LONGLONG StartingOffset; // e.g., in bytes
ULONGLONG LengthlnBytes; // e.g., multiple of sector size
} DEVICE DATA SET RANGE, *PDEVICE_DATA_SET_RANGE;
[00130] Exemplary IOCTL data structures for implementing aspects of the
subject matter described herein may be defined as follows:
//
// input structure for IOCTL STORAGE MANAGE DATA SET ATTRIBUTES
// 1. Value of ParameterBlockOffset or ParameterBlockLength is 0 indicates that
// Parameter Block does not exist.
// 2. Value of DataSetRangesOffset or DataSetRangesLength is 0 indicates that
// DataSetRanges Block does not exist. If DataSetRanges Block exists, it
contains
// contiguous DEVICE DATA SET RANGE structures.
// 3. The total size of buffer is at least:
// sizeof
(DEVICE_MANAGE_DATA_SET_ATTRIBUTES)+ParameterBlockLength+
DataSetRangesLength
typedef struct DEVICE MANAGE DATA SET ATTRIBUTES {
ULONG Size; // Size of structure
//
DEVICE MANAGE DATA SET ATTRIBUTES
DEVICE DATA MANAGEMENT SET ACTION Action;
ULONG Flags; // Global flags across all actions
ULONG ParameterBlockOffset; // aligned to corresponding
structure
// alignment
ULONG ParameterBlockLength; // 0 means Parameter Block
does not
// exist.
ULONG DataSetRangesOffset; // aligned to
//
DEVICE DATA SET RANGE
// structure alignment.
ULONG DataSetRangesLength; // 0 means DataSetRanges
Block
// does not exist.
} DEVICE MANAGE DATA SET ATTRIBUTES,
*PDEVICE_MANAGE_DATA_SET_ATTRIBUTES;
//
// Parameter structure definitions for copy offload actions
//
//
// Offload copy interface operates in 2 steps: offload read and offload write.
//
// Input for OffloadRead action is set of extents in DSM structure
// Output parameter of an OffloadRead is a token, returned by the target which will
// identify a "point in time" snapshot of extents taken by the target.
// Format of the token may be opaque to requestor and specific to the target.
//
// Note: a token length to 512 is exemplary. SCSI interface to OffloadCopy may
enable
// negotiable size. A new action may be created for variable-sized tokens.
#define DSM OFFLOAD MAX TOKEN LENGTH 512
// Keep as ULONG multiple
typedef struct DEVICE DSM OFFLOAD READ PARAMETERS {
ULONG Flags;
ULONG TimeToLive; // token Time to live (e.g., in milliseconds); may be
requested
// by requestor
} DEVICE DSM OFFLOAD READ PARAMETERS,
*PDEVICE_DSM_OFFLOAD_READ_PARAMETERS;
typedef struct DEVICE DSM OFFLOAD WRITE PARAMETERS {
ULONG Flags;
ULONG Reserved; // reserved for future usage
ULONGLONG TokenOffset; // The starting offset to copy from data represented
by token
UCHAR Token[DSM_OFFLOAD_MAX_TOKEN_LENGTH] ; // the
token
} DEVICE DSM OFFLOAD WRITE PARAMETERS,
*PDEVICE_DSM_OFFLOAD_WRITE_PARAMETERS;
typedef struct STORAGE OFFLOAD READ OUTPUT {
ULONG OffloadReadFlags; // Outbound flags
ULONG Reserved;
ULONGLONG LengthProtected; // The length of data represented by token,
from the
// lowest StartingOffset
ULONG TokenLength; // Length of the token in bytes.
UCHAR Token[DSM_OFFLOAD_MAX_TOKEN_LENGTH] ;
// The token created on success.
} STORAGE OFFLOAD READ OUTPUT,
*PSTORAGE_OFFLOAD_READ_OUTPUT;
//
// STORAGE OFFLOAD READ OUTPUT flag definitions
//
#define STORAGE OFFLOAD READ RANGE TRUNCATED (0x000 1)
typedef struct STORAGE OFFLOAD WRITE OUTPUT {
ULONG OffloadWriteFlags; // Out flags
ULONG Reserved; // reserved for future usage
ULONGLONG LengthCopied; // Out parameter : The length of content
copied from the
// start of the data represented by the token
} STORAGE OFFLOAD WRITE OUTPUT,
*PSTORAGE_OFFLOAD_WRITE_OUTPUT;
//
// STORAGE OFFLOAD WRITE OUTPUT flag definitions - used in
OffloadWriteFlags mask
//
// Write performed, but on a truncated range
#define STORAGE OFFLOAD WRITE RANGE TRUNCATED (0x0001)
//
// DSM output structure for bi-directional actions.
//
// Output parameter block is located in resultant buffer at the offset contained in
// OutputBlockOffset field. Offset is calculated from the beginning of the buffer,
// and callee will align it according to the requirement of the action specific
structure
// template.
// Example: for OffloadRead action in order to get a pointer to the output structure,
a caller
// shall
//
// PSTORAGE OFFLOAD READ OUTPUT pReadOut =
// (PSTORAGE OFFLOAD READ OUTPUT) ((UCHAR *)pOutputBuffer +
//
((PDEVICE_MANAGE_DATA_SET_ATTRIBUTES_OUTPUT)pOutputBuffer)
// ->OutputBlockOffset)
//
typedef struct DEVICE MANAGE DATA SET ATTRIBUTES OUTPUT {
ULONG Size; // Size of the structure
DEVICE DATA MANAGEMENT SET ACTION Action;
// Action requested and performed
ULONG Flags; // Common output flags for DSM actions
ULONG OperationStatus; // Operation status; used for offload actions
// (placeholder for richer semantic, like
PENDING)
ULONG ExtendedError; // Extended error information
ULONG TargetDetailedError; // Target specific error; may be used for
offload actions
// (SCSI sense code)
ULONG ReservedStatus; // Reserved field
ULONG OutputBlockOffset; // Action specific aligned to
corresponding structure
// alignment.
ULONG OutputBlockLength; // 0 means Output Parameter Block does
not exist.
} DEVICE MANAGE DATA SET ATTRIBUTES OUTPUT,
*PDEVICE MANAGE DATA SET ATTRIBUTES OUTPUT;
[00131] FIGS. 6-8 are flow diagrams that generally represent exemplary
actions that may occur in accordance with aspects of the subject matter described
herein. For simplicity of explanation, the methodology described in conjunction
with FIGS. 6-8 is depicted and described as a series of acts. It is to be understood
and appreciated that aspects of the subject matter described herein are not limited
by the acts illustrated and/or by the order of acts. In one embodiment, the acts
occur in an order as described below. In other embodiments, however, the acts
may occur in parallel, in another order, and/or with other acts not presented and
described herein. Furthermore, not all illustrated acts may be required to
implement the methodology in accordance with aspects of the subject matter
described herein. In addition, those skilled in the art will understand and appreciate
that the methodology could alternatively be represented as a series of interrelated
states via a state diagram or as events.
[00132] Turning to FIG. 6, at block 605, the actions begin. At block 610, a
request for a representation of data of the store is received. The request is
conveyed in conjunction with a description (e.g., location and length) that identifies
a portion of the store. Here, the word "portion" may be all or less than all of the
store. For example, referring to FIG. 2, the requestor 210 may request a token for
data on the store 220. In making the request, the requestor 210 may send a location
of the data (e.g., a file name, a handle to an open file, a physical offset into a file,
volume, or raw disk, or the like) together with a length.
[00133] At block 6 15, in response to the request, a token is received that
represents the data that was logically stored in the portion of the store when the
token is bound to the data. As mentioned previously, the token may represent less
data than requested. For example, referring to FIG. 2, one or more of the data
access components 215 may return a token to the requestor 210 that represents the
data requested or a subset thereof. The token may be a size (e.g., a certain number
of bits or bytes) that is independent of the size of the data represented by the token.
The token may be received together with other tokens in a data structure where
each token in the data structure is associated with a different portion of the data or
two or more tokens are associated with the same portion of the data.
[00134] Receiving the token may be accompanied by an indication that the
token represents data that is a subset of the data requested. This indication may
take the form, for example, of a length of the data represented by the token.
[00135] At block 620, the token is provided to perform an offload write. The
token may be provided along with information indicating whether to logically write
all or a portion of the data via an offload provider. This information may include,
for example, a destination-relative offset, a token-relative offset, and length. An
token-relative offset of 0 and length equal to the entire length of the data
represented by the token may indicate to copy all of the data while any offset with a
length less than the entire length of the data may indicate to copy less than the
entire data.
[00136] For example, referring to FIG. 2, the requestor may pass the token to
the data access components 215 that may pass the token to a token manager 225 to
obtain a location of the represented data. Where the token manager 225 is part of
the storage system providing access to the store 220 (e.g., in a SAN), the token may
be provided to a data access component of the SAN which may then use the token
to identify the data and logically write the data indicated by the request.
[00137] As mentioned previously, the offload provider may be external to the
apparatus sending the request. In addition, once the offload provider receives the
request, the offload provider may logically write the data independent of additional
interaction with any component of the apparatus sending the request. For example,
referring to FIG. 3, once the token and request to write reach the data access
components 315, the components of the device 335 may logically write the data as
requested without any additional assistance from the device 330.
[00138] At block 625, other actions, if any, may be performed. Note that at
block 630, at any time after the token has been generated, the requestor (or another
of the data access components) may explicitly request that the token be invalidated.
If this request is sent during the middle of a copy operation, in one implementation,
the copy may be allowed to proceed to completion. In another implementation, the
copy may be aborted, an error may be raised, or other actions may occur.
[00139] Turning to FIG. 7, at block 705, the actions begin. At block 710, a
request for a representation of data of a store is received. The request is conveyed
in conjunction with a description that identifies a portion of the store at which the
data is located. The request may be received at a component of a storage area
network or at another data access component. For example, referring to FIG. 3, one
or more of the data access components 315 may receive a request for a token
together with an offset, length, logical unit number, file handle, or the like that
identifies data on the store 220.
[00140] At block 715, a token is generated. The token generated may
represent data that was logically stored (e.g., in the store 220 of FIG. 3). As
mentioned previously, this data may be non-changing or allowed to change during
the validity of the token depending on implementation. The token may represent a
subset of the data requested as indicated previously. For example, referring to FIG.
3, the token manager 320 may generate a token to represent the data requested by
the requestor 210 on the store 220.
[00141] At block 720, the token is associated with the represented data via a
data structure. For example, referring to FIG. 3, the token manager 320 may store
an association in the token store 325 that associates the generated token with the
represented data.
[00142] At block 725, the token is provided to the requestor. For example,
referring to FIG. 3, the token manager or one of the data access components 315
may provide the token to the data access components 310 to provide to the
requestor 210. The token may be returned with a length that indicates the size of
data represented by the token.
[00143] At block 730, other actions, if any, may be performed. Note that at
block 735, at any time after the token has been generated, the token manager may
invalidate the token depending on various factors as described previously. If the
token is invalidated during a write operation affecting the data, in one
implementation, the write may be allowed to proceed to completion. In another
implementation, the write may be aborted, an error may be raised, or other actions
may occur.
[00144] FIG. 8 is a block diagram that generally represents exemplary actions
that may occur when an offload write is received at an offload provider in
accordance with various aspects of the subject matter described herein. At block
805, the actions begin.
[00145] At block 810, a token is received. The token may be received with
data that indicates whether to logically write all or some of the data represented by
the token. For example, referring to FIG. 3, one of the data access components 315
may receive a token from one of the data access components 310 of FIG. 3.
[00146] At block 815, a determination is made as to whether the token is
valid. For example, referring to FIG. 3, the token manager 320 may determine
whether the received token is valid by consulting the token store 325. If the token
is valid the actions continue at block 820; otherwise, the request may be failed and
the actions continue at block 817.
[00147] At block 817, the request is failed. For example, referring to FIG. 3,
the data access components 315 may indicate that the copy failed.
[00148] At block 820, the data requested by the offload copy is identified.
For example, referring to FIG. 3, the token manager 320 may consult the token
store 325 to obtain a location or other identifier of the data associated with the
token. The token may include or be associated with data that indicates an apparatus
that hosts the data represented by the token.
[00149] At block 825, a logical write of the data represented by the token is
performed. For example, referring to FIG. 3, the device 335 may logically write
the data represented by the token.
[00150] At block 830, other actions, if any, may be performed.
[00151] As can be seen from the foregoing detailed description, aspects have
been described related to offload reads and writes. While aspects of the subject
matter described herein are susceptible to various modifications and alternative
constructions, certain illustrated embodiments thereof are shown in the drawings
and have been described above in detail. It should be understood, however, that
there is no intention to limit aspects of the claimed subject matter to the specific
forms disclosed, but on the contrary, the intention is to cover all modifications,
alternative constructions, and equivalents falling within the spirit and scope of
various aspects of the subject matter described herein.
[00152] WHAT IS CLAIMED IS:
1. A method implemented at least in part by a computer, the method
comprising:
sending a request for a representation of first data of a store, the request
conveyed in conjunction with a description that identifies a portion of the store;
in response to the request, receiving a token that represents second data
logically stored in the portion of the store, the second data a subset, potentially a
proper subset, of the first data; and
providing the token together with information indicating to logically write
third data via an offload provider operable to use the token at least to locate the
third data, the third data a subset, potentially a proper subset, of the second data.
2. The method of claim 1, wherein sending a request that includes a
description of a portion of storage comprises sending an offset and length, the
offset representing a location of the first data in the store, the length representing a
size of the first data.
3. The method of claim 1, wherein receiving the token comprises
receiving a number usable to obtain the second data as the second data existed
when the token was bound to the second data, the number usable by the offload
provider to identify the second data, the number being generated by a random or
pseudo random mechanism.
4. The method of claim 1, wherein receiving the token comprises
receiving the token together with other tokens in a data structure, each token in the
data structure usable to obtain a different portion of the second data as the different
portion existed when the token was bound to the different portion.
5. The method of claim 1, further comprising receiving one or more
other tokens each of which also represents the second data and further comprising
providing one or more of the other tokens in conjunction with providing the token.
6. A computer storage medium having computer-executable
instructions, which when executed perform actions, comprising:
receiving , from a requestor, a request for a representation of first data
logically stored in a store, the request conveyed in conjunction with a description
that identifies a portion of the store at which the first data is located;
generating a token that represents second data logically stored in the portion
of the store, the second data a subset, potentially a proper subset, of the first data;
associating the token with the second data via a data structure, the token
usable to obtain the second data as the second data existed when the token was
bound to the second data; and
providing the token to the requestor.
7. The computer storage medium of claim 6, further comprising:
receiving the token together with third data that indicates whether to write
all or some of the second data;
determining if the token is valid;
if the token is not valid, failing the request.
8. The computer storage medium of claim 6, wherein receiving a request
for a representation of first data logically stored in a store comprises receiving the
request at a data access component of a storage area network device, wherein
generating a token that represents the second data comprises generating a value by
a component of the storage area network device, and wherein associating the token
with the second data via a data structure comprises placing an entry in a table, the
entry including the token and an identifier of the second data as the second data
existed at a time at or after the request is received at the data access component and
before or when the token is returned to the requestor.
9. The computer storage medium of claim 6, further comprising
receiving a request to change the first data and in response thereto invalidating the
token.
10. The computer storage medium of claim 6, further comprising
invalidating the token based on one or more of memory constraints, write activity,
disk constraints, network bandwidth constraints, latency constraints, and time to
live.
11. The computer storage medium of claim 6, further comprising
receiving a request to change the first data and in response thereto making the
change and maintaining a logical copy of the second data as it existed when the
token was bound to the second data.
12. In a computing environment, a system, comprising:
a requestor operable to send a request for a representation of first data of a
store, the requestor further operable to receive a token that represents second data
that is a subset, potentially a proper subset, of the first data, the requestor further
operable to provide the token together with third data that indicates to logically
write all or a portion of the second data;
a token manager operable to generate the token and to associate the token
with the second data via a data structure; and
an offload provider operable to receive the token together with the third
data, the offload provider further operable to consult the token manager to
determine whether the token is valid, the second data logically maintained as nonchanging
at least while the token is valid.
13. The system of claim 1 , wherein the offload provider is further
operable to logically write all or some of the second data as indicated by the third
data if the token is valid, the third data also including a destination in which to put
written data.
14. The system of claim 12, wherein the requestor comprises a
component of an apparatus that is external to an apparatus hosting the offload
provider.
15. The system of claim 12, wherein the token manager and the offload
provider are both hosted on an apparatus of a storage area network.

Documents

Application Documents

#	Name	Date
1	1875-CHENP-2013 POWER OF ATTORNEY 07-03-2013.pdf	2013-03-07
1	1875-CHENP-2013-AbandonedLetter.pdf	2019-12-16
2	1875-CHENP-2013-FER.pdf	2019-06-12
2	1875-CHENP-2013 FORM-5 07-03-2013.pdf	2013-03-07
3	FORM-6-1801-1900(JAYA).37.pdf	2015-03-13
3	1875-CHENP-2013 FORM-3 07-03-2013.pdf	2013-03-07
4	FORM-6-1801-1900(JAYA).37.pdf ONLINE	2015-03-09
4	1875-CHENP-2013 FORM-2 FIRST PAGE 07-03-2013.pdf	2013-03-07
5	MS to MTL Assignment.pdf ONLINE	2015-03-09
5	1875-CHENP-2013 FORM-1 07-03-2013.pdf	2013-03-07
6	MTL-GPOA - JAYA.pdf ONLINE	2015-03-09
6	1875-CHENP-2013 DRAWINGS 07-03-2013.pdf	2013-03-07
7	1875-CHENP-2013 DESCRIPTION (COMPLETE) 07-03-2013.pdf	2013-03-07
7	1875-CHENP-2013 FORM-6 01-03-2015.pdf	2015-03-01
8	abstract1875-CHENP-2013.jpg	2014-08-27
8	1875-CHENP-2013 CORRESPONDENCE OTHERS 07-03-2013.pdf	2013-03-07
9	1875-CHENP-2013 CORRESPONDENCE OTHERS 12-08-2013.pdf	2013-08-12
9	1875-CHENP-2013 CLAIMS SIGNATURE LAST PAGE 07-03-2013.pdf	2013-03-07
10	1875-CHENP-2013 CLAIMS 07-03-2013.pdf	2013-03-07
10	1875-CHENP-2013FORM-3 12-08-2013.pdf	2013-08-12
11	1875-CHENP-2013 CORRESPONDENCE OTHERS 10-05-2013.pdf	2013-05-10
11	1875-CHENP-2013 PCT PUBLICATION 07-03-2013.pdf	2013-03-07
12	1875-CHENP-2013.pdf	2013-03-14
13	1875-CHENP-2013 CORRESPONDENCE OTHERS 10-05-2013.pdf	2013-05-10
13	1875-CHENP-2013 PCT PUBLICATION 07-03-2013.pdf	2013-03-07
14	1875-CHENP-2013 CLAIMS 07-03-2013.pdf	2013-03-07
14	1875-CHENP-2013FORM-3 12-08-2013.pdf	2013-08-12
15	1875-CHENP-2013 CLAIMS SIGNATURE LAST PAGE 07-03-2013.pdf	2013-03-07
15	1875-CHENP-2013 CORRESPONDENCE OTHERS 12-08-2013.pdf	2013-08-12
16	1875-CHENP-2013 CORRESPONDENCE OTHERS 07-03-2013.pdf	2013-03-07
16	abstract1875-CHENP-2013.jpg	2014-08-27
17	1875-CHENP-2013 FORM-6 01-03-2015.pdf	2015-03-01
17	1875-CHENP-2013 DESCRIPTION (COMPLETE) 07-03-2013.pdf	2013-03-07
18	1875-CHENP-2013 DRAWINGS 07-03-2013.pdf	2013-03-07
18	MTL-GPOA - JAYA.pdf ONLINE	2015-03-09
19	1875-CHENP-2013 FORM-1 07-03-2013.pdf	2013-03-07
19	MS to MTL Assignment.pdf ONLINE	2015-03-09
20	FORM-6-1801-1900(JAYA).37.pdf ONLINE	2015-03-09
20	1875-CHENP-2013 FORM-2 FIRST PAGE 07-03-2013.pdf	2013-03-07
21	FORM-6-1801-1900(JAYA).37.pdf	2015-03-13
21	1875-CHENP-2013 FORM-3 07-03-2013.pdf	2013-03-07
22	1875-CHENP-2013-FER.pdf	2019-06-12
22	1875-CHENP-2013 FORM-5 07-03-2013.pdf	2013-03-07
23	1875-CHENP-2013-AbandonedLetter.pdf	2019-12-16
23	1875-CHENP-2013 POWER OF ATTORNEY 07-03-2013.pdf	2013-03-07

Search Strategy

1	search_10-06-2019.pdf