Claim Based Content Reputation Service

Abstract: In some embodiments, a database may store a plurality of content claims for previously evaluated data items, with each of the plurality of content claims being associated in the database with a corresponding stored digital fingerprint of a previously evaluated data item. One or more servers may be configured to receive a determined digital fingerprint of a data item from a client device on another network node, to submit a query to the database using the determined digital fingerprint as a primary key, and to transmit one or more content claims returned by the query to the client device. In some embodiments, the server(s) may be further configured to receive the content claim(s) and the digital fingerprint associated therewith from one or more computers on another network node, and to cause the received content claim(s) and digital fingerprint associated therewith to be stored in the database.

Patent Information

Application #

Filing Date

20 November 2012

Publication Number

16/2014

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Email

lsmds@lakshmisri.com

Parent Application

Patent Number

Legal Status

Grant Date

2022-11-21

Renewal Date

Applicants

MICROSOFT CORPORATION

One Microsoft Way Redmond Washington 98052 6399

Inventors

1. BISSO Robert

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

2. ISMAILOV Vadim

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

3. LIU Lingling

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

4. SACCONE Robert

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

5. BEHER Mukeshkumar

c/o Microsoft Corporation LCA International Patents One Microsoft Way Redmond Washington 98052 6399

Specification

CLAIM BASED CONTENT REPUTATION SERVICE
BACKGROUND
[0001] In order to ensure that digital data complies with business, security and other
policies, the trend in recent years has been to subject such data to an ever increasing
number of pre-access evaluation processes. Examples of such processes include hygiene
scans, filtering, classifications, and data analysis. Particularly computationally intensive
operations may include, for example, virus/spyware scans, spam detection, keyword
detections, malicious/inappropriate/prohibited URL detection, data leakage prevention,
data classification, etc.
[0002] The number of scanning/classification technologies that a piece of content needs
to be subjected to has continued to increase over time. In addition, the size of a typical piece
of content that needs to be scanned has trended upwards and has shown no sign of leveling
off. Both of these trends result in an ever increasing amount of computer resources (CPU,
memory, network bandwidth, etc.) that are needed to perform scanning/classification.
[0003] The problem is further exacerbated by the fact that the data generally needs to be
repeatedly re-analyzed, rescanned, and/or reclassified by various security and compliance
products as it moves within or across computer networks. These products are typically
installed on desktops, notebooks, different servers (like mail, file, collaboration, etc.), and
services in the cloud. As data traverses each of these way points, the same
computationally intensive operations are often performed over and over again. This leads
to decreased performance and throughput of the system and requires installation of
additional hardware, software, etc. In the case of services, the additional overhead can
have a direct impact on the profitability of the service.
SUMMARY
[0004] In some embodiments of the invention, a system may comprise a database and
one or more servers. The database may, for example, store a plurality of content claims
for previously evaluated data items, with each of the plurality of content claims being
associated in the database with a corresponding stored digital fingerprint of a previously
evaluated data item. The server(s) may, for example, be configured to receive a
determined digital fingerprint of a data item from a client device on another network node,
to submit a query to the database using the determined digital fingerprint as a primary key,
and to transmit one or more content claims returned by the query to the client device.
[0005] In some embodiments, the server(s) may be further configured to receive the
content claim(s) and the digital fingerprint associated therewith from one or more
computers on another network node, and to cause the received content claim(s) and digital
fingerprint associated therewith to be stored in the database.
[0006] In some embodiments, the client device may comprise one or more computers
configured to process the data item with a hash function to determine the fingerprint of the
data item, to send a first message to the server(s) comprising the determined digital
fingerprint of the data item, to receive a second message from the server(s) comprising the
content claim(s) returned by the query, and to make a decision as to how to further process
the data item based upon the content claim(s) included in the second message.
[0007] In some embodiments, a method for identifying one or more content claims for a
data item involves comparing a digital fingerprint of the data item with a stored digital
fingerprint associated with the content claim(s). If the determined digital fingerprint
matches the stored digital fingerprint, then it is determined that the one or more content
claims are associated with the determined digital fingerprint of the data item.
[0008] In some embodiments, one or more computer-readable storage mediums are
encoded with instructions that, when executed by one or more processors at a first network
node, cause the processor(s) to perform a method for identifying one or more content
claims for a data item that includes steps of (a) comparing a determined digital fingerprint
of the data item with a stored digital fingerprint associated with the content claim(s), and
(b) if the determined digital fingerprint matches the stored digital fingerprint, then
determining that the one or more content claims are associated with the determined digital
fingerprint of the data item.
[0009] In some embodiments, the content claim(s) and the digital fingerprint associated
therewith may be received from one or more computers at another network node, and the
received content claim(s) and digital fingerprint associated therewith may be persistently
stored.
[0010] In some embodiments, the determined digital fingerprint may be received from
one or more computers at another network node, and the content claim(s) determined to be
associated with the determined digital fingerprint of the data item may be transmitted to
one or more computers at the other network node.
[0011] In some embodiments, a content certificate including both the stored digital
fingerprint and the content claim(s) may be received from one or more computers at
another network node.
[0012] In addition to or in lieu of the foregoing illustrative embodiments, one or more of
the following characteristics, features and/or functions may additionally or alternatively be
present in or practiced by some embodiments of the invention:
• The results of content hygiene, filtering, classification and other content analysis
processes may be expressed as a set of content claims.
• Fingerprinting may be used as a non-intrusive and reliable mechanism for
associating content claims with the data that was processed.
• The content reputation service may accept, aggregate, and store content claims
submitted by participating trusted parties.
• The content reputation service may be queried for claims associated with a particular
piece of data.
• Existing content claims may be invalidated when the configuration and/or security
policy that was used when these claims were issued is changed.
• Time sensitive claims may be invalidated and removed from the reputation service
database when a predefined time period has elapsed.
• The content reputation service may be independent of the data formats that are being
protected. For example, when a fingerprint is calculated, data may be treated simply
as a byte stream, rather than data of a particular format, such that knowledge of data
formats is not required by the system.
• The content reputation service may be independent of the transport protocols used to
transfer data.
• The content reputation service may be independent of the storage type where data is
stored.
• The content reputation service may be communicated with (for the purpose of
submitting and requesting claims) using different network protocols.
• Use of the content reputation service may ensure that claims that were created for a
file (or other digital content) will be associated with all other copies of the same file.
• Exporting a content claim set and serializing it into a secure and verifiable "content
certificate" may allow it to be transmitted to interested parties that for some reason
cannot communicate directly with the reputation service (or are not aware of it
altogether). The system may, for example, leverage known and accepted standards
based technologies to format and secure such a content certificate. A client
application may, for example, be able to associate a content certificate with the data
it was issued about as well as to read one or more of the claims about such data.
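The time-based invalidation mentioned in the list above might be sketched as follows. This is only an illustration under stated assumptions: the `issued` field, the `max_age` parameter, and the function name `purge_expired` are hypothetical and do not come from the specification.

```python
from datetime import datetime, timedelta

def purge_expired(claims, now, max_age):
    """Keep only claims issued within max_age of now (hypothetical retention rule)."""
    return [c for c in claims if now - c["issued"] <= max_age]

now = datetime(2012, 11, 20, 12, 0)
claims = [
    {"type": "virus", "issued": now - timedelta(hours=1)},   # recent claim
    {"type": "spam", "issued": now - timedelta(days=30)},    # stale claim
]

# With a seven-day retention period, only the recent claim survives.
fresh = purge_expired(claims, now, max_age=timedelta(days=7))
print([c["type"] for c in fresh])
```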
BRIEF DESCRIPTION OF DRAWINGS
[0013] The accompanying drawings are not intended to be drawn to scale. In the
drawings, each identical or nearly identical component that is illustrated in various figures
is represented by a like numeral. For purposes of clarity, not every component may be
labeled in every drawing. In the drawings:
[0014] Fig. 1 is a diagram showing an illustrative example of a network in which various
clients may access a content reputation service;
[0015] Fig. 2 shows an illustrative example of a routine a client may execute to submit
one or more claims to a content reputation service;
[0016] Fig. 3 shows an illustrative example of a routine a client may execute to obtain
one or more existing claims maintained by a content reputation service;
[0017] Fig. 4 is a conceptual representation of a claim set including actual claim data for
an image;
[0018] Fig. 5 shows an illustrative example of a routine that may be executed by a server
of a content reputation service to process a client request to submit a claim;
[0019] Fig. 6 shows an illustrative example of a routine a server of a content reputation
service may execute to process a client request to retrieve and return a content claim; and
[0020] Fig. 7 shows an illustrative example of how the claim set shown in Fig. 4 may
appear when exported as a content certificate.
DETAILED DESCRIPTION
[0021] We have recognized that the redundant scanning performed by existing systems
occurs because an application or service running on one computer is unable to leverage
results that were produced by other applications (running on the same computer or one or
more different computers) over the same content. Antivirus applications are a good
example. In existing systems, a file (and possibly its identical copies) moving within an
organization generally needs to be repeatedly scanned as it moves between different
computers and servers (e-mail, file, collaboration etc.). We have further recognized that
such repetitive scans and classifications may be avoided, for example, by providing a
secure way of sharing results of prior scans and classifications with all instances of the
same application or service and other interested parties who can leverage these results.
[0022] In some embodiments of the present invention, the results of scans,
classifications, or any other operations performed over digital content may be persisted as
a set of content claims in a centralized repository, accessible by interested parties, in
such a way that the claims are associated with the data over which they were generated. In
some embodiments, for example, this may be accomplished through the use of a
centralized content reputation service that is accessible over a network. Such a
mechanism may allow future rescans and/or reclassifications of the unmodified (or
duplicated) data to be avoided entirely, or at least in part. In some embodiments, the
results (claim sets) may be stored separately from the data to which the claim pertains and
the process would not require any modifications to the data itself. Such a solution may
thus ensure the integrity and authenticity of any issued claims.
[0023] In some embodiments, the results of various types of content based hygiene
and/or filtering technologies that are performed during content based
analysis/inspection/scan may be made available as a set of content claims. Trusted
services and applications may, for example, submit results of their operations to the
content reputation service for storage along with an identifier that may be used to later
access such results. In some embodiments, for example, a digital fingerprint of the
evaluated content may be used as such an identifier. Any participating party may
thereafter request existing claims for a given piece of digital content by calculating the
fingerprint of the subject data (or otherwise determining the identifier) and sending it as
part of a request to the content reputation service. The content reputation service may then
return claims (if any) associated with the data to the requestor. The content reputation
service may, for example, store claims in a relational database that uses the identifier as a
key.
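The submit-and-request interaction described above can be pictured with a small in-memory stand-in for the service. This is a sketch under assumptions, not the claimed system: the dictionary store, the function names, and the choice of SHA-256 as the identifier are all illustrative.

```python
import hashlib
from collections import defaultdict

# Hypothetical in-memory stand-in for the content reputation service,
# mapping a fingerprint (the identifier) to the claims stored under it.
_claims_by_fingerprint = defaultdict(list)

def fingerprint(data: bytes) -> str:
    """Identifier for a piece of content; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).hexdigest()

def submit_claims(data: bytes, claims: list) -> None:
    """A trusted party stores the results of its evaluation."""
    _claims_by_fingerprint[fingerprint(data)].extend(claims)

def request_claims(data: bytes) -> list:
    """Any participant asks for existing claims about the same bytes."""
    return list(_claims_by_fingerprint.get(fingerprint(data), []))

document = b"quarterly report"
submit_claims(document, ["virus-free"])
print(request_claims(document))           # claims for identical content
print(request_claims(b"something else"))  # nothing known about other content
```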
[0024] Such a technique may thus allow content claims to be reused at a future time for
various purposes. One such purpose may be to avoid repetitive analysis/inspection/scan of
the data when it doesn't change as it traverses computers on a network. Another such
purpose may be to enable consumers of the data to make verifiable trust decisions using
the content claims set. Yet another purpose may be to reduce consumption of computer
resources used for hygiene, filtering, classification and other content inspections across the
network. Furthermore, in some embodiments, the analysis of content claims residing in
the database of the content reputation service may provide viable statistical information
regarding data usage, geographical migration of data, sources of infection, etc.
[0025] Fig. 1 shows an illustrative embodiment of a system 100 in which a content
reputation service 102 may be accessed over a network 104. As shown, the content
reputation service 102 may, for example, comprise one or more servers 106 (or other
computing devices) having sufficient computing capabilities to process requests from all
of the other devices in the system 100 that need to access the service 102. The computing
devices of the content reputation service 102 can take on any of numerous forms,
depending on the operational requirements and scale of the system 100 in which it is
deployed, and the invention does not require the use of computing devices of any
particular type or configuration. For large scale implementations, conventional scaling
and/or load balancing techniques may, for example, be employed to distribute the
processing of requests among various servers and/or other computing resources deployed
within the service 102.
[0026] The database 108 may take on any of numerous forms and configurations, and
the invention is not limited to the use of any particular type of database. In some
embodiments, for example, the database 108 may be a relational database that stores and
accesses one or more content claims associated with particular pieces of digital content
using keys. In some embodiments, such keys may, for example, comprise digital
fingerprints of the digital content for which the service maintains one or more content
claims. Depending on performance and database size considerations, various tables and
foreign keys may be employed to enhance performance. For example, in some
embodiments, a table may map fingerprints to internal identity keys, and each such
internal identity key may be used to access one or more content claims associated
therewith. It should be appreciated, however, that in other embodiments, the database
108 may comprise any other database architecture or storage mechanism capable of
associating identifiers (e.g., fingerprints) with corresponding stored content claims. In
some embodiments, the content claims could even be stored in an address-indexed storage
device, e.g., a hard drive or RAM, and a table could be used to map fingerprints (or
another identifier) to memory addresses of corresponding content claims. As used herein,
the term "database" is intended to encompass all such storage architectures.
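One way to picture the two-table arrangement described above (fingerprints mapped to internal identity keys, which in turn index the claims) is the following SQLite sketch. The table names, column names, and helper functions are assumptions made for illustration; the specification does not prescribe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fingerprints (
        id        INTEGER PRIMARY KEY,   -- internal identity key
        algorithm TEXT NOT NULL,
        value     TEXT NOT NULL,
        UNIQUE (algorithm, value)
    );
    CREATE TABLE claims (
        fingerprint_id INTEGER NOT NULL REFERENCES fingerprints(id),
        claim_type     TEXT NOT NULL,
        assertion      TEXT NOT NULL,
        issuer         TEXT NOT NULL
    );
""")

def store_claim(algorithm, value, claim_type, assertion, issuer):
    """Register the fingerprint if it is new, then attach a claim to its identity key."""
    conn.execute(
        "INSERT OR IGNORE INTO fingerprints (algorithm, value) VALUES (?, ?)",
        (algorithm, value),
    )
    (fp_id,) = conn.execute(
        "SELECT id FROM fingerprints WHERE algorithm = ? AND value = ?",
        (algorithm, value),
    ).fetchone()
    conn.execute(
        "INSERT INTO claims VALUES (?, ?, ?, ?)",
        (fp_id, claim_type, assertion, issuer),
    )

def lookup_claims(algorithm, value):
    """Return all claims stored for a fingerprint, in insertion order."""
    return conn.execute(
        """SELECT c.claim_type, c.assertion, c.issuer
           FROM claims c JOIN fingerprints f ON f.id = c.fingerprint_id
           WHERE f.algorithm = ? AND f.value = ?
           ORDER BY c.rowid""",
        (algorithm, value),
    ).fetchall()

store_claim("SHA-256", "abc123", "virus", "clean", "ExampleAV")
store_claim("SHA-256", "abc123", "spam", "not-spam", "ExampleFilter")
print(lookup_claims("SHA-256", "abc123"))
```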
[0027] The network 104 may be any of numerous networks, or groups of networks, and
the invention is not limited to the use of a network of any particular type or configuration.
The network 104 may, for example, comprise a local area network such as that used in a
corporate environment and/or a wide area network such as the internet. Any network
architecture and/or communication protocol may be employed in various embodiments.
A few examples of suitable networks and protocols include Ethernet, token-ring,
TCP/IP, HTTP, SOAP, REST, RPC, XML-RPC, etc.
[0028] As shown in Fig. 1, in addition to the content reputation service 102 itself, the
system 100 may additionally comprise various clients that can communicate with the
service 102 over the network 104. Several examples of computing devices that might be
clients of the service (e.g., because they are capable of submitting content claims to the
service 102 and/or are capable of accessing content claims maintained by the service) are
illustrated. It should be appreciated, however, that types of devices in addition to or in
lieu of those shown may be employed, and the invention is not limited to the use of any
particular type of device. In the example of Fig. 1, the illustrated clients include a laptop
computer 110, a printer/fax/scanner 112, a server 114, a desktop computer 116, and a
handheld computing device (e.g., a PDA, smartphone, etc.) 118.
[0029] As used herein, a "network node" refers to a device or group of devices that has
or shares a unique address, or address component, on a network. In some circumstances, a
given network node may comprise one or more sub-nodes. In such a case, one component
of a network address may uniquely identify the node on the network and another
component of the address may uniquely identify each of the sub-nodes. In the example of
Fig. 1, each of the content reputation service 102, the laptop computer 110, the
printer/fax/scanner 112, the server 114, the desktop computer 116, and the handheld
computing device 118 may be located at a different node of the network 104.
[0030] In some embodiments, the content reputation service 102 may, for example,
accept, aggregate, store, and furnish upon request claims about digital content (files or any
other type of data). Additionally, in some embodiments, steps may be taken to ensure that
only trusted parties are allowed to submit claims to the service 102. In such embodiments,
claims submitted from unknown or un-trusted clients will not be accepted. In some
embodiments, there are no restrictions as to which clients are allowed to look up existing
claims.
[0031] As noted above, in some embodiments, claims may be associated with the data
via digital fingerprints. Calculation of a digital fingerprint may be done in any of
numerous ways, and the invention is not limited to any particular fingerprinting technique.
In some embodiments, for example, fingerprints may be calculated using a cryptographic
hash function. Such an implementation may provide good uniformity in resulting
fingerprints and, depending on the hash function being used, may dramatically minimize
the possibility of collisions (either accidental or intentional). Because a cryptographic hash
is a one-way function, it is computationally infeasible to deduce the original content (or even
its nature) from the hash value. Examples of suitable hash functions are SHA-1, SHA-256, and
SHA-512. It should be appreciated, however, that other fingerprinting techniques could additionally
or alternatively be employed in some embodiments. For example, for applications where data
security is not an issue, a non-cryptographic hash function could additionally or
alternatively be employed.
[0032] Digital fingerprints can generally be reliably determined with minimal
computational effort, and the same piece of data will always yield the same digital fingerprint.
Accordingly, in embodiments that use digital fingerprints as claim identifiers, any
modification to the data will result in a different fingerprint and will automatically break
the association of all existing claims with the modified copy of the file.
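As a minimal sketch of the fingerprinting described above, the following treats the data purely as a byte stream and hashes it in chunks. The function name `fingerprint_stream`, the chunk size, and the default of SHA-256 are assumptions; any of the hash functions named above could be substituted.

```python
import hashlib
import io

def fingerprint_stream(stream, algorithm="sha256", chunk_size=64 * 1024):
    """Hash a byte stream in chunks, with no knowledge of the data format."""
    digest = hashlib.new(algorithm)
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()

original = fingerprint_stream(io.BytesIO(b"example content"))
copy = fingerprint_stream(io.BytesIO(b"example content"))
modified = fingerprint_stream(io.BytesIO(b"example content!"))

print(original == copy)      # identical bytes give the same fingerprint
print(original == modified)  # any modification gives a different fingerprint
```

Because the fingerprint of the modified copy differs, a lookup using it would not find the claims stored under the original fingerprint.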
[0033] Figs. 2 and 3 show examples of how clients may use fingerprints when
interacting with the content reputation service 102. In particular, Fig. 2 shows an
illustrative example of a routine that a client (e.g., one of the computing devices 110, 112,
114, 116, and 118 shown in Fig. 1) may execute to submit one or more claims to the
content reputation service 102, and Fig. 3 shows an example of a routine a client may
execute to obtain one or more existing claims maintained by the service 102. The
illustrated routines may, for example, be implemented using instructions stored in a
computer-readable medium that can be accessed and executed by a processor of a client
machine.
[0034] As shown in Fig. 2, after processing data (e.g., performing a virus or malware
scan, classifying data, etc.) (step 202), a client may create one or more claims expressing
the results of the processing that was performed (step 204). After creating the one or more
claims, the client may calculate a fingerprint of the data (step 206), and then submit the
one or more claims, together with the calculated data fingerprint, to the content reputation
service 102, e.g., by sending a message to the service 102 via the network 104 (step 208).
As discussed further below, in some embodiments, such a message may include a client-side
digital certificate, signed by a trusted Certification Authority, identifying the product
used to create the one or more claims. Such a technique may, for example, help prevent
potential attacks intended to poison the database 108.
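The client routine of Fig. 2 might look roughly like the sketch below. The `scan` and `send` callables, the message layout, and the issuer name are hypothetical placeholders for a real scanner and transport; certificate handling is omitted.

```python
import hashlib
import json

def submit_scan_results(data: bytes, scan, send, issuer: str) -> dict:
    """Sketch of steps 202-208: process, create claims, fingerprint, submit."""
    result = scan(data)                                       # step 202
    claims = [{"type": "virus", "assertion": result,
               "issuer": issuer}]                             # step 204
    fp = hashlib.sha256(data).hexdigest()                     # step 206
    message = {"algorithm": "SHA-256", "fingerprint": fp, "claims": claims}
    send(json.dumps(message))                                 # step 208
    return message

# Stand-ins: a scanner that always reports "clean" and a list acting as the network.
sent = []
message = submit_scan_results(
    b"file bytes", scan=lambda d: "clean", send=sent.append, issuer="ExampleAV"
)
print(message["claims"])
```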
[0035] As shown in Fig. 3, prior to committing resources to evaluating the content of a
particular piece of data (e.g., by performing a virus or malware scan, classifying data,
etc.), a client may calculate a fingerprint for the data (step 302), and submit the fingerprint
to the content reputation service 102 as a part of a request for existing claims (step 304).
If the reputation service 102 identifies any claims associated with the submitted
fingerprint, it may retrieve and return those claims to the client (step 306). The client may
then make a further decision as to whether and/or how to evaluate the data based upon the
information contained in any returned claims (step 308). In some embodiments, messages
received from the content reputation service 102 may carry a valid, verifiable certificate, thus
allowing clients to confirm that they are communicating with a trusted source. In addition,
in some embodiments, the returned claims may also be digitally signed so as to further
enhance the security and reliability of the system.
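The lookup routine of Fig. 3 can be sketched in the same spirit. The `request_claims` callable stands in for the network request, and the decision labels (`block`, `skip-scan`, `scan`) and assertion values are assumptions chosen for illustration.

```python
import hashlib

def decide_processing(data: bytes, request_claims) -> str:
    """Sketch of steps 302-308: fingerprint, request claims, then decide."""
    fp = hashlib.sha256(data).hexdigest()        # step 302
    claims = request_claims(fp)                  # steps 304 and 306
    virus_assertions = {c["assertion"] for c in claims if c["type"] == "virus"}
    if "infected" in virus_assertions:           # step 308
        return "block"
    if "clean" in virus_assertions:
        return "skip-scan"
    return "scan"

# Hypothetical service contents, keyed by fingerprint.
known = {
    hashlib.sha256(b"good file").hexdigest(): [{"type": "virus", "assertion": "clean"}],
    hashlib.sha256(b"bad file").hexdigest(): [{"type": "virus", "assertion": "infected"}],
}
lookup = lambda fp: known.get(fp, [])

print(decide_processing(b"good file", lookup))
print(decide_processing(b"bad file", lookup))
print(decide_processing(b"new file", lookup))
```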
[0036] In some embodiments, an unlimited number of content claims may be associated
with a given piece of data. Although such claims may be created by different trusted
issuers, when they are issued over the same piece of data (which yields the same fingerprint
value), they all may be grouped by the data fingerprint.
[0037] In certain implementations, when a client makes a request of the content
reputation service 102 to return claims about digital content, the client may either request
all existing claims or narrow the scope of the returned set by specifying the type of claims
it is interested in (e.g., issuer, time claims were issued, content assertions etc.).
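Narrowing a returned claim set by type, issuer, or issue time could be sketched as follows. The field names and the keyword parameters are assumptions, not part of the described service interface.

```python
from datetime import datetime

def filter_claims(claims, claim_type=None, issuer=None, issued_since=None):
    """Return the claims matching every criterion that was supplied."""
    selected = []
    for claim in claims:
        if claim_type is not None and claim["type"] != claim_type:
            continue
        if issuer is not None and claim["issuer"] != issuer:
            continue
        if issued_since is not None and claim["issued"] < issued_since:
            continue
        selected.append(claim)
    return selected

claims = [
    {"type": "virus", "issuer": "EdgeScanner", "issued": datetime(2012, 11, 1)},
    {"type": "spam", "issuer": "EdgeScanner", "issued": datetime(2012, 11, 10)},
    {"type": "virus", "issuer": "PortalScanner", "issued": datetime(2012, 11, 15)},
]

print(len(filter_claims(claims)))                      # no criteria: all claims
print(len(filter_claims(claims, claim_type="virus")))  # only virus claims
```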
[0038] As noted above, in some implementations, any modification to the data will lead
to a different calculated fingerprint. Thus, any modification to a file will automatically
disassociate the file from all previously issued claims.
[0039] Table 1 (below) shows an illustrative example of the properties/attributes that
may be contained within a single content claim.
In addition to predefined assertions, claim issuers may define their own
assertions.
Issuer: Entity that issued this claim. This may, for example, identify a particular
application, service, user, etc.
Data: Optional custom data that the issuer may attach to the claim. The reputation service
need not attempt to interpret this data; it may simply store it and return
it with the claim when requested. For example, an antivirus
application may submit the virus engine or signature version with each
virus claim it makes. This way, when it receives claims back, it can
determine whether the virus engine or signatures were updated since the
claim it just received was issued. It may thus decide to scan the file
anyway, despite an existing claim that the file is clean, and then submit a new
claim.
TABLE 1
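A single claim along the lines of Table 1 might be modelled as below. Only the Issuer and Data properties appear in the table as reproduced here; the `claim_type` and `assertion` fields are assumptions based on the claim type and assertion mentioned elsewhere in this description, and `signature_version` is a hypothetical key inside the opaque custom data.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    claim_type: str                            # assumed field
    assertion: str                             # assumed field
    issuer: str                                # "Issuer" in Table 1
    data: dict = field(default_factory=dict)   # "Data": opaque to the service

def should_rescan(existing: Claim, current_signature_version: tuple) -> bool:
    """Rescan when the issuer's signatures are newer than those recorded in the claim."""
    claimed = tuple(existing.data.get("signature_version", ()))
    return claimed < current_signature_version

existing = Claim("virus", "clean", "ExampleAV", {"signature_version": (1, 4, 2)})
print(should_rescan(existing, (1, 5, 0)))  # True: signatures were updated
print(should_rescan(existing, (1, 4, 2)))  # False: the claim is still current
```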
[0040] As pointed out previously, in some embodiments, multiple claims, potentially
issued by different entities, may exist in the database 108 of the content reputation service
102. When requested, such claims may be returned as a "claim set."
[0041] Table 2 (below) shows an illustrative example of how such a claim set may be
formatted.
ClaimSet
Fingerprinting Algorithm
Fingerprint Value
Claim 1
Claim 2
Claim . . .
Claim N
TABLE 2
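The layout of Table 2 maps naturally onto a small container type. The class below is a sketch; the JSON serialization is an assumed wire format chosen for illustration, and the claim contents are made up.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ClaimSet:
    fingerprinting_algorithm: str
    fingerprint_value: str
    claims: list = field(default_factory=list)   # Claim 1 ... Claim N

image_bytes = b"example image bytes"
claim_set = ClaimSet(
    fingerprinting_algorithm="SHA-256",
    fingerprint_value=hashlib.sha256(image_bytes).hexdigest(),
    claims=[
        {"type": "virus", "assertion": "clean", "issuer": "ExampleAV"},
        {"type": "classification", "assertion": "public", "issuer": "ExampleDLP"},
    ],
)

serialized = json.dumps(asdict(claim_set), indent=2)
print(serialized)
```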
[0042] Fig. 4 is a conceptual representation of a claim set including actual claim data (in
this case, for an image).
[0043] In some embodiments, different claim sets may be returned in different formats
depending on the protocol used to communicate with the content reputation service 102.
Clients may, for example, communicate with the service using SOAP messages (web
service) or any other network protocol that supports either connection or packet/message
based security (e.g., REST, HTTP, RPC, XML-RPC, etc.). In some embodiments,
implementation of the service may also support multiple bindings and/or be able to
communicate using different protocols at the same time. The content reputation service
may be installed on premises, in the cloud, or both.
[0044] As noted above, in some embodiments, in order to prevent database poisoning
and other types of attacks, only trusted applications may be allowed to submit content
claims. The trust mechanism may, for example, employ widely used industry standards
such as WS-Trust and server and client side certificates for such a purpose.
[0045] In some embodiments, regardless of the protocol being used to communicate
with the content reputation service 102, all content claims may be stored in a centralized
relational database 108 with the fingerprint used as the primary key.
[0046] Figs. 5 and 6 show examples of how the content reputation service 102 may
process received requests from clients to submit and retrieve content claims. In particular,
Fig. 5 shows an illustrative example of a routine one or more servers 106 of the content
reputation service 102 may execute, together with the database 108, to process a client
request to submit one or more claims, and Fig. 6 shows an example of a routine such
server(s) may execute, together with the database, to process a client request to retrieve
and return one or more content claims. The illustrated routines may, for example, be
implemented using instructions stored in one or more computer-readable mediums that can
be accessed and executed by one or more processors of the server(s) 106 and/or controllers
of the database 108.
[0047] As shown in Fig. 5, upon receiving a client request to submit one or more claims
to the service 102 (step 502), if the same claim (e.g., the same claim type and assertion)
does not already exist in the database 108 (see step 503), a server 106 may format the
received claim(s) and write them to the database 108 using the received fingerprint as the
primary key for the database entry (step 504). As noted above, in some embodiments, the
server(s) 106 may refuse to accept any new claim submission that is not signed by a
trusted Certification Authority or does not identify the product used to create the claim.
[0048] In some embodiments, if the same claim is found to already exist in the
database 108 for the submitted digital fingerprint (see step 503), the content reputation
service 102 may evaluate the content of the new claim against that of the existing claim
and update some or all of the information in the claim based upon that evaluation (step
505).
[0049] One example of a scenario in which the content reputation service 102 may
update information for an existing claim is where, for example, a newly-submitted claim
contains a virus signature version that is more recent than the virus signature version of an
existing claim associated with the same digital fingerprint. Such a scenario may occur, for
example, when a client decides to scan a file in spite of the existence of a claim for
the file because the client possesses a more recent virus signature version than that which
is reflected in the existing claim. After performing the scan using the updated virus
signature version, the client in such a scenario may, for example, submit a virus claim
(reflecting the virus signature version that was employed for the scan) to the content
reputation service 102.
[0050] When the content reputation service 102 receives such a claim, it may, for example,
determine that a claim of the same type, with the same assertion (and possibly even from
the same issuer) already exists. The content reputation service 102 may then, for example,
compare the virus signature versions, as well as the creation dates and times of the
respective claims, and update entries in the database 108 (e.g., database columns) with
what it determines to be the most up-to-date and reliable information for the claim. In the
case of an updated virus signature version, the updated entries for the existing claim may,
for example, include the date and time of the virus scan and the virus signature version
used for the scan.
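The submit handling of Fig. 5, including the update case just described, might be sketched as follows. The in-memory `database` dictionary, the field names, and the rule "newer signature version wins" are assumptions; trust checks on the submitter are omitted.

```python
# Hypothetical store: fingerprint -> {(claim type, assertion): claim record}
database = {}

def handle_submit(fingerprint, claim):
    """Sketch of steps 502-505: store a new claim or refresh an existing one."""
    claims = database.setdefault(fingerprint, {})
    key = (claim["type"], claim["assertion"])
    existing = claims.get(key)                   # step 503: same claim already stored?
    if existing is None:
        claims[key] = dict(claim)                # step 504: write the new claim
        return "stored"
    if claim["signature_version"] > existing["signature_version"]:
        # Step 505: keep the most up-to-date scan details for the existing claim.
        existing["signature_version"] = claim["signature_version"]
        existing["issued"] = claim["issued"]
        return "updated"
    return "unchanged"

first = {"type": "virus", "assertion": "clean",
         "signature_version": (1, 4), "issued": "2012-11-01T09:00:00"}
newer = {"type": "virus", "assertion": "clean",
         "signature_version": (1, 5), "issued": "2012-11-20T09:00:00"}

print(handle_submit("fp-1", first))
print(handle_submit("fp-1", newer))
print(handle_submit("fp-1", first))
```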
[0051] As shown in Fig. 6, upon receiving a client request to retrieve existing claims
(step 602), a server 106 may formulate a database query using the fingerprint included in
the client request as the primary key (step 604). If a matching fingerprint exists in the
database 108 (step 606), the server 106 may retrieve the claim(s) associated with the
fingerprint (step 608), filter those claims according to any criteria specified by the client
(step 609), and return the filtered claim data (which may be either a subset of the retrieved
claim data or the entire claim set if filtering is not employed or if no filtering criteria are
specified by the client) to the requesting client (step 610). If a matching fingerprint does
not exist in the database (step 606), the server 106 may inform the client that no matching
claims were found (step 612). The content reputation service 102 may thus identify any
claims that are associated with a given piece of data by comparing a calculated fingerprint
for the data (received from the client) with a fingerprint stored in the database that is
associated with the claims for the data in question.
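The retrieval routine of Fig. 6 can be sketched along the same lines. The dictionary store, the response shape, and the equality-based `criteria` are illustrative assumptions.

```python
def handle_retrieve(database, fingerprint, criteria=None):
    """Sketch of steps 602-612: look up by fingerprint, filter, and respond."""
    stored = database.get(fingerprint)             # steps 604 and 606
    if stored is None:
        return {"found": False, "claims": []}      # step 612
    claims = list(stored)                          # step 608
    if criteria:                                   # step 609
        claims = [c for c in claims
                  if all(c.get(k) == v for k, v in criteria.items())]
    return {"found": True, "claims": claims}       # step 610

store = {
    "fp-1": [
        {"type": "virus", "assertion": "clean", "issuer": "EdgeScanner"},
        {"type": "spyware", "assertion": "clean", "issuer": "EdgeScanner"},
    ]
}

print(handle_retrieve(store, "fp-1", {"type": "virus"}))
print(handle_retrieve(store, "fp-unknown"))
```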
[0052] In practice, content claims may be created when data is subjected to a certain
type of analysis for the first time within the system 100. Thereafter, as data travels within
the system and needs to be accessed, previously issued claims may be used in order to get
necessary information about the data without analyzing it all over again. In some
embodiments, additional claims about data may be added when new types of scans are
performed on the data, thus extending the claim set with new information.
[0053] The following practical example illustrates how the content reputation service
102 may be employed to minimize the resources that need to be devoted to examining the
content of a particular piece of data. First, consider the common situation where a
document file (e.g., a MICROSOFT WORD ® document) is attached to an e-mail that is
sent to somebody within an organization. Upon receiving the e-mail with the attachment,
the organization's edge server may scan the attachment, determine that it is free of viruses,
spyware, and malicious URLs, and create three claims with the content reputation service
102. The recipient may then, for example, receive the file and upload it to the
organization's internal SHAREPOINT ® site. (Suppose that the security policy that is
enforced on this SHAREPOINT ® site requires that all files be scanned for viruses,
spyware, malicious URLs, and DLP.) During upload, the security scanner for the
SHAREPOINT ® site may calculate the file's fingerprint and send a request to the content
reputation service 102. The claim set, including the three previously created claims, may
then be returned. As a result, the security scanner may determine that only a DLP scan
needs to be performed on the file and, after performing such a scan, may issue additional
DLP claims to the content reputation service. Thereafter, if, for example, the same file is
uploaded to another SHAREPOINT ® site within the organization, the security scanner
for that SHAREPOINT ® site may determine that no other scans need to be performed,
because a request to the content reputation service by that security scanner will return all
necessary claims.
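The decision made by the security scanner in this example can be sketched as follows. The sketch is illustrative: SHA-256 is assumed as the fingerprint function, and the scan-type labels, the claim layout, and the function names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List


def fingerprint(data: bytes) -> str:
    """Fingerprint of a data item; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(data).hexdigest()


def scans_still_needed(required: Iterable[str], existing_claims: List[Dict[str, str]]) -> List[str]:
    """Return the required scan types not yet covered by an existing claim."""
    covered = {claim["type"] for claim in existing_claims}
    return [scan for scan in required if scan not in covered]


# Security policy of the site: all four scan types are required.
policy = ["virus", "spyware", "malicious_url", "dlp"]
# Claims previously created by the edge server for the same fingerprint.
claims = [
    {"type": "virus", "assertion": "clean"},
    {"type": "spyware", "assertion": "clean"},
    {"type": "malicious_url", "assertion": "clean"},
]
pending = scans_still_needed(policy, claims)  # only the DLP scan remains
```

Once a DLP claim has been issued for the same fingerprint, the same call returns an empty list and no further scans are needed.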
[0054] Importantly, in most circumstances, the overhead caused by interaction with the
content reputation service 102 may be significantly lower than that of an actual scan or
other data evaluation process. It should also be noted that, in some embodiments,
inclusion of additional data inspection processes (which increases scan/evaluation time)
will not have an adverse effect on claim submission and lookup time.
[0055] In certain embodiments, the content reputation service 102 may export a content
claim or claim set as a digitally signed file, e.g., an XML file. This "content certificate"
may, for example, be delivered with or without (if the recipient already possesses this
data) corresponding data to parties who for one reason or another have no access to the
content reputation service and cannot communicate with it directly. Despite this fact, the
recipient may reliably verify the validity of the content certificate, and, if valid, may
decide to trust some or all of the included content claims.
[0056] Fig. 7 shows an example of how the claim set shown in Fig. 4 may appear when
exported as a content certificate. Such an XML file may, for example, be signed using an
enveloped XML digital signature. The recipient of such a file may first verify the digital
signature to make sure that the integrity of the XML is intact. If the signature is valid, the
recipient may, for example, calculate the fingerprint of the data using an algorithm
specified in the XML. If the resulting value matches the fingerprint value in the XML, then
the recipient may determine that all claims in this XML file are relevant to the data and
can be trusted. Thus, in some embodiments, the recipient of such a content certificate is
able to identify one or more claims that are associated with a given piece of data by
comparing the calculated fingerprint for the data with a fingerprint included in the content
certificate that also contains the claim(s). In some embodiments, a verification tool may
be provided that automates verification of such content certificates.
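The fingerprint check performed by the recipient can be sketched as follows. This is a minimal sketch under stated assumptions: the element and attribute names (`Fingerprint`, `algorithm`, `Claim`) are hypothetical and do not reproduce the layout of Fig. 7, and verification of the enveloped XML signature is assumed to have been done by a separate tool whose result is passed in as `signature_is_valid`.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET
from typing import Dict, List

# Algorithms this sketch is willing to accept from a certificate (assumption).
ALLOWED_ALGORITHMS = {"sha256", "sha512"}


def claims_for_data(certificate_xml: str, data: bytes, signature_is_valid: bool) -> List[Dict[str, str]]:
    """Return the certificate's claims only if they demonstrably apply to `data`."""
    if not signature_is_valid:
        return []  # integrity of the XML cannot be trusted
    root = ET.fromstring(certificate_xml)
    fp = root.find("Fingerprint")
    if fp is None or fp.text is None:
        return []
    algorithm = fp.get("algorithm", "")
    if algorithm not in ALLOWED_ALGORITHMS:
        return []
    # Recompute the fingerprint with the algorithm named in the certificate.
    computed = hashlib.new(algorithm, data).hexdigest()
    if not hmac.compare_digest(computed, fp.text.strip().lower()):
        return []  # the certificate describes different data
    return [dict(claim.attrib) for claim in root.iter("Claim")]
```

Restricting the accepted algorithms is a design choice of the sketch, so that a certificate cannot steer the recipient toward a weak hash function.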
[0057] Although perhaps not desirable in at least some circumstances, in some
embodiments, a content certificate may additionally or alternatively be directly appended to
or embedded within the file to which it pertains. One example of a file type where such an
implementation may be possible is email. A content certificate may, for example, be
placed in the header space of the email without affecting the rest of the mail content.
Some file formats, e.g., MICROSOFT OFFICE ® files, also allow for extensibility where
additional payload may be stored. Additionally, in some embodiments, a generic file
wrapping envelope that stores both the original file and the content certificate may be
employed. The MICROSOFT ® Generic File Protection (GFP) file wrapper may, for example,
be used for such a purpose.
[0058] The use of content certificates for data items (whether as separate files or as
information that is appended to or embedded within such items) may also offer some
additional flexibility when the data item itself has been modified. In some embodiments,
for example, at least some reclassification of content (e.g. PII, HBI, etc.) may be avoided
by employing classification technology that generates a "soft hash" which can be used to
determine how close the document is to the original. In such embodiments, if the result
is within a tolerance, the entire reclassification process may be avoided.
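The tolerance check can be illustrated with the sketch below. The description does not specify how the "soft hash" is computed, so the sketch substitutes a plain Jaccard similarity over word shingles as a simple stand-in; the function names, the shingle size, and the tolerance value are all assumptions.

```python
from typing import Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Set of overlapping k-word sequences in the text (case-insensitive)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(original: str, modified: str) -> float:
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0."""
    a, b = shingles(original), shingles(modified)
    if not a and not b:
        return 1.0  # two empty documents are treated as identical
    return len(a & b) / len(a | b)


def needs_reclassification(original: str, modified: str, tolerance: float = 0.8) -> bool:
    """True when the modified document has drifted beyond the assumed tolerance."""
    return similarity(original, modified) < tolerance
```

A production system would more likely compare compact similarity digests than full shingle sets, but the decision rule (skip reclassification while the similarity stays within the tolerance) has the same form.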
[0059] Having thus described several aspects of at least one embodiment of this
invention, it is to be appreciated that various alterations, modifications, and improvements
will readily occur to those skilled in the art.
[0060] Such alterations, modifications, and improvements are intended to be part of this
disclosure, and are intended to be within the spirit and scope of the invention.
Accordingly, the foregoing description and drawings are by way of example only.
[0061] The above-described embodiments of the present invention can be implemented
in any of numerous ways. For example, the embodiments may be implemented using
hardware, software or a combination thereof. When implemented in software, the
software code can be executed on any suitable processor or collection of processors,
whether provided in a single computer or distributed among multiple computers. Such
processors may be implemented as integrated circuits, with one or more processors in an
integrated circuit component. However, a processor may be implemented using circuitry in
any suitable format.
[0062] Further, it should be appreciated that a computer may be embodied in any of a
number of forms, such as a rack-mounted computer, a desktop computer, a laptop
computer, or a tablet computer. Additionally, a computer may be embedded in a device
not generally regarded as a computer but with suitable processing capabilities, including a
Personal Digital Assistant (PDA), a smart phone or any other suitable portable or fixed
electronic device.
[0063] Also, a computer may have one or more input and output devices. These devices
can be used, among other things, to present a user interface. Examples of output devices
that can be used to provide a user interface include printers or display screens for visual
presentation of output and speakers or other sound generating devices for audible
presentation of output. Examples of input devices that can be used for a user interface
include keyboards, and pointing devices, such as mice, touch pads, and digitizing tablets.
As another example, a computer may receive input information through speech
recognition or in other audible format.
[0064] Such computers may be interconnected by one or more networks in any suitable
form, including as a local area network or a wide area network, such as an enterprise
network or the Internet. Such networks may be based on any suitable technology and may
operate according to any suitable protocol and may include wireless networks, wired
networks or fiber optic networks.
[0065] Also, the various methods or processes outlined herein may be coded as software
that is executable on one or more processors that employ any one of a variety of operating
systems or platforms. Additionally, such software may be written using any of a number
of suitable programming languages and/or programming or scripting tools, and also may
be compiled as executable machine language code or intermediate code that is executed on
a framework or virtual machine.
[0066] In this respect, the invention may be embodied as a computer readable medium
(or multiple computer readable media) (e.g., a computer memory, one or more floppy
discs, compact discs (CD), optical discs, digital video disks (DVD), magnetic tapes, flash
memories, circuit configurations in Field Programmable Gate Arrays or other
semiconductor devices, or other non-transitory, tangible computer storage medium)
encoded with one or more programs that, when executed on one or more computers or
other processors, perform methods that implement the various embodiments of the
invention discussed above. The computer readable medium or media can be transportable,
such that the program or programs stored thereon can be loaded onto one or more different
computers or other processors to implement various aspects of the present invention as
discussed above. As used herein, the term "non-transitory computer-readable storage
medium" encompasses only a computer-readable medium that can be considered to be a
manufacture (i.e., article of manufacture) or a machine.
[0067] The terms "program" or "software" are used herein in a generic sense to refer to
any type of computer code or set of computer-executable instructions that can be
employed to program a computer or other processor to implement various aspects of the
present invention as discussed above. Additionally, it should be appreciated that
according to one aspect of this embodiment, one or more computer programs that when
executed perform methods of the present invention need not reside on a single computer or
processor, but may be distributed in a modular fashion amongst a number of different
computers or processors to implement various aspects of the present invention.
[0068] Computer-executable instructions may be in many forms, such as program
modules, executed by one or more computers or other devices. Generally, program
modules include routines, programs, objects, components, data structures, etc. that
perform particular tasks or implement particular abstract data types. Typically the
functionality of the program modules may be combined or distributed as desired in various
embodiments.
[0069] Also, data structures may be stored in computer-readable media in any suitable
form. For simplicity of illustration, data structures may be shown to have fields that are
related through location in the data structure. Such relationships may likewise be achieved
by assigning storage for the fields with locations in a computer-readable medium that
conveys relationship between the fields. However, any suitable mechanism may be used
to establish a relationship between information in fields of a data structure, including
through the use of pointers, tags or other mechanisms that establish relationship between
data elements.
[0070] Various aspects of the present invention may be used alone, in combination, or in
a variety of arrangements not specifically discussed in the embodiments described in the
foregoing, and are therefore not limited in their application to the details and arrangement of
components set forth in the foregoing description or illustrated in the drawings. For
example, aspects described in one embodiment may be combined in any manner with
aspects described in other embodiments.
[0071] Also, the invention may be embodied as a method, of which an example has been
provided. The acts performed as part of the method may be ordered in any suitable way.
Accordingly, embodiments may be constructed in which acts are performed in an order
different than illustrated, which may include performing some acts simultaneously, even
though shown as sequential acts in illustrative embodiments.
[0072] Use of ordinal terms such as "first," "second," "third," etc., in the claims to
modify a claim element does not by itself connote any priority, precedence, or order of one
claim element over another or the temporal order in which acts of a method are performed,
but are used merely as labels to distinguish one claim element having a certain name from
another element having a same name (but for use of the ordinal term) to distinguish the
claim elements.
[0073] Also, the phraseology and terminology used herein is for the purpose of
description and should not be regarded as limiting. The use of "including," "comprising,"
"having," "containing," "involving," and variations thereof herein is meant to
encompass the items listed thereafter and equivalents thereof as well as additional items.
What is claimed is:
1. A method for identifying a content claim for a data item, comprising steps
of:
(a) with at least one computer at a first network node, comparing a determined
digital fingerprint of the data item with a stored digital fingerprint associated with at least
one content claim; and
(b) if the determined digital fingerprint matches the stored digital fingerprint, then,
with at least one computer at the first network node, determining that the at least one
content claim is associated with the determined digital fingerprint of the data item.
2. The method of claim 1, further comprising steps of:
(c) prior to performing the steps (a) and (b), with at least one computer at the first
network node, receiving the at least one content claim and the digital fingerprint
associated therewith from at least one computer at a second network node; and
(d) prior to performing the steps (a) and (b), persistently storing the received
content claim and digital fingerprint associated therewith in memory accessible to the at
least one computer at the first network node.
3. The method of claim 2, further comprising a step of:
(e) prior to performing the step (d), with at least one computer at the first network
node, verifying that the received content claim was generated by a trusted source.
4. The method of any of claims 1-3, further comprising steps of:
prior to performing the steps (a) and (b), with at least one computer at the first
network node, receiving the determined digital fingerprint from at least one computer at a
second network node; and
after performing the steps (a) and (b), with at least one computer at the first
network node, transmitting the at least one content claim determined to be associated with
the determined digital fingerprint of the data item to at least one computer at the second
network node.
5. The method of any of claims 1-4, further comprising steps of:
with at least one computer at a second network node, sending a first message to at
least one computer at the first network node, the first message comprising the determined
digital fingerprint of the data item; and
after performing the step (a), with at least one computer at the second network
node, receiving a second message from at least one computer at the first network node, the
second message comprising at least one content claim that at least one computer at the first
network node identified as being associated with the determined digital fingerprint.
6. The method of claim 5, further comprising steps of:
prior to sending the first message and receiving the second message, with at least
one computer at the second network node or at least one computer at a third network node,
sending a third message to at least one computer at the first network node, the third
message comprising a determined fingerprint of the data item and the at least one content
claim associated therewith; and
prior to performing the steps (a) and (b), persistently storing the digital fingerprint
and at least one content claim included in the third message in memory accessible to the at
least one computer at the first network node.
7. The method of claim 6, further comprising a step of:
prior to sending the third message, with at least one computer at the second
network node or at least one computer at the third network node, evaluating the data item
to yield the at least one content claim.
8. The method of any of claims 5-7, further comprising a step of:
prior to sending the first message, with at least one computer at the second network
node, processing the data item with a cryptographic hash function to determine the
fingerprint of the data item.
9. The method of any of claims 5-8, further comprising a step of:
after receiving the second message, with at least one computer at the second
network node, making a decision as to how to further process the data item based upon the
at least one content claim included in the second message.
10. The method of claim 1, further comprising a step of:
(c) prior to performing the steps (a) and (b), with at least one computer at the first
network node, receiving from at least one computer at a second network node a content
certificate including both the stored digital fingerprint and the at least one content claim.
11. The method of any of claims 1-10, wherein the at least one content claim
comprises at least one result of an antivirus or malware scan.
12. At least one computer-readable storage medium encoded with instructions
that, when executed by one or more processors, cause the one or more processors to
perform the method of any of claims 1-11.
13. A system, comprising:
a database storing a plurality of content claims for previously evaluated data items,
each of the plurality of content claims being associated in the database with a
corresponding stored digital fingerprint of a previously evaluated data item; and
at least one server configured to receive a determined digital fingerprint of a data
item from a client device on another network node, to submit a query to the database using
the determined digital fingerprint as a primary key, and to transmit at least one content
claim returned by the query to the client device.
14. The system of claim 13, wherein the at least one server is further
configured to receive the at least one content claim and the digital fingerprint associated
therewith from at least one computer on another network node, and to cause the at least
one received content claim and digital fingerprint associated therewith to be stored in the
database.
15. The system of claim 13 or 14, wherein the client device comprises at least
one computer configured to process the data item with a hash function to determine the
fingerprint of the data item, to send a first message to the at least one server comprising
the determined digital fingerprint of the data item, to receive a second message from the at
least one server comprising the at least one content claim returned by the query, and to
make a decision as to how to further process the data item based upon the at least one
content claim included in the second message.

Documents

Application Documents

#	Name	Date
1	9789-CHENP-2012 POWER OF ATTORNEY 20-11-2012.pdf	2012-11-20
2	9789-CHENP-2012 FORM-5 20-11-2012.pdf	2012-11-20
3	9789-CHENP-2012 FORM-3 20-11-2012.pdf	2012-11-20
4	9789-CHENP-2012 FORM-2 FIRST PAGE 20-11-2012.pdf	2012-11-20
5	9789-CHENP-2012 FORM-1 20-11-2012.pdf	2012-11-20
6	9789-CHENP-2012 DRAWINGS 20-11-2012.pdf	2012-11-20
7	9789-CHENP-2012 DESCRIPTION (COMPLETE) 20-11-2012.pdf	2012-11-20
8	9789-CHENP-2012 CLAIMS 20-11-2012.pdf	2012-11-20
9	9789-CHENP-2012 PCT PUBLICATION 20-11-2012.pdf	2012-11-20
10	9789-CHENP-2012 CORRESPONDENCE OTHERS 20-11-2012.pdf	2012-11-20
11	9789-CHENP-2012.pdf	2012-11-21
12	9789-CHENP-2012 FORM-3 09-05-2013.pdf	2013-05-09
13	9789-CHENP-2012 CORRESPONDENCE OTHERS 09-05-2013.pdf	2013-05-09
14	abstract9789-CHENP-2012.jpg	2014-03-14
15	9789-CHENP-2012 FORM-6 26-02-2015.pdf	2015-02-26
16	MTL-GPOA - JAYA.pdf	2015-03-13
17	MS to MTL Assignment.pdf	2015-03-13
18	FORM-6-1801-1900(JAYA).22.pdf	2015-03-13
19	9789-CHENP-2012-FER.pdf	2019-07-29
20	9789-CHENP-2012-PETITION UNDER RULE 137 [23-01-2020(online)].pdf	2020-01-23
21	9789-CHENP-2012-OTHERS [23-01-2020(online)].pdf	2020-01-23
22	9789-CHENP-2012-FORM 3 [23-01-2020(online)].pdf	2020-01-23
23	9789-CHENP-2012-FER_SER_REPLY [23-01-2020(online)].pdf	2020-01-23
24	9789-CHENP-2012-DRAWING [23-01-2020(online)].pdf	2020-01-23
25	9789-CHENP-2012-CLAIMS [23-01-2020(online)].pdf	2020-01-23
26	9789-CHENP-2012-ABSTRACT [23-01-2020(online)].pdf	2020-01-23
27	9789-CHENP-2012-PatentCertificate21-11-2022.pdf	2022-11-21
28	9789-CHENP-2012-IntimationOfGrant21-11-2022.pdf	2022-11-21

Search Strategy

1	search_26-07-2019.pdf

ERegister / Renewals

3rd: 20 Jan 2023

From 17/05/2013 - To 17/05/2014

4th: 20 Jan 2023

From 17/05/2014 - To 17/05/2015

5th: 20 Jan 2023

From 17/05/2015 - To 17/05/2016

6th: 20 Jan 2023

From 17/05/2016 - To 17/05/2017

7th: 20 Jan 2023

From 17/05/2017 - To 17/05/2018

8th: 20 Jan 2023

From 17/05/2018 - To 17/05/2019

9th: 20 Jan 2023

From 17/05/2019 - To 17/05/2020

10th: 20 Jan 2023

From 17/05/2020 - To 17/05/2021

11th: 20 Jan 2023

From 17/05/2021 - To 17/05/2022

12th: 20 Jan 2023

From 17/05/2022 - To 17/05/2023

13th: 20 Jan 2023

From 17/05/2023 - To 17/05/2024

14th: 15 May 2024

From 17/05/2024 - To 17/05/2025