How To Time-Stamp A Digital Document
Abstract. The prospect of a world in which all text, audio, picture, and video
documents are in digital form on easily modifiable media raises the issue of how
to certify when a document was created or last changed. The problem is to
time-stamp the data, not the medium. We propose computationally practical
procedures for digital time-stamping of such documents so that it is infeasible for
a user either to back-date or to forward-date his document, even with the collusion
of a time-stamping service. Our procedures maintain complete privacy of the
documents themselves, and require no record-keeping by the time-stamping
service.
Key words. Time-stamp, Hash.
1. Introduction
In many situations there is a need to certify the date a document was created or last
modified. For example, in intellectual property matters, it is sometimes crucial to
verify the date an inventor first put in writing a patentable idea, in order to establish
its precedence over competing claims.
One accepted procedure for time-stamping a scientific idea involves daily
notations of one's work in a lab notebook. The dated entries are entered one after
another in the notebook, with no pages left blank. The sequentially numbered,
sewn-in pages of the notebook make it difficult to tamper with the record without
leaving telltale signs. If the notebook is then stamped on a regular basis by a notary
public or reviewed and signed by a company manager, the validity of the claim is
further enhanced. If the precedence of the inventor's ideas is later challenged, both
1 Date received: August 19, 1990. Date revised: October 26, 1990.
S. Haber and W. S. Stornetta
the physical evidence of the notebook and the established procedure serve to
substantiate the inventor's claims of having had the ideas on or before a given date.
There are other methods of time-stamping. For example, one can mail a letter to
oneself and leave it unopened. This ensures that the enclosed letter was created
before the time postmarked on the envelope. Businesses incorporate more elaborate
procedures into their regular order of business to enhance the credibility of their
internal documents, should they be challenged at a later date. For example, these
methods may ensure that the records are handled by more than one person, so that
any tampering with a document by one person will be detected by another. But all
these methods rest on two assumptions. First, the records can be examined for
telltale signs of tampering. Second, there is another party that views the document
whose integrity or impartiality is seen as vouchsafing the claim.
We believe these assumptions are called into serious question for the case of
documents created and preserved exclusively in digital form. This is because elec-
tronic digital documents are so easy to tamper with, and the change need not
leave any telltale sign on the physical medium. What is needed is a method of
time-stamping digital documents with the following two properties. First, we must
find a way to time-stamp the data itself, without any reliance on the characteristics
of the medium on which the data appears, so that it is impossible to change even
one bit of the document without the change being apparent. Second, it should be
impossible to stamp a document with a time and date different from the actual one.
The purpose of this paper is to introduce a mathematically sound and computa-
tionally practical solution to the time-stamping problem. In the sections that follow,
we first consider a naive solution to the problem, the digital safety-deposit box. This
serves the pedagogical purpose of highlighting additional difficulties associated
with digital time-stamping beyond those found in conventional methods of time-
stamping. Successive improvements to this naive solution finally lead to practical
ways to implement digital time-stamping.
2. The Setting
The setting for our problem is a distributed network of users, perhaps representing
individuals, different companies, or divisions within a company; we refer to the users
as clients. Each client has a unique identification number.
A solution to the time-stamping problem may have several parts. There is a
procedure that is performed immediately when a client desires to have a document
time-stamped. There should be a method for the client to verify that this procedure
has been correctly performed. There should also be a procedure for meeting a third
party's challenge to the validity of a document's time-stamp.
As with any cryptographic problem, it is a delicate matter to characterize precisely
the security achieved by a time-stamping scheme. A good solution to the time-
stamping problem is one for which, under reasonable assumptions about the
computational abilities of the users of the scheme and about the complexity of a
computational problem, and possibly about the trustworthiness of the users, it is
difficult or impossible to produce false time-stamps. Naturally, the weaker the
assumptions needed, the better.
3. A Naive Solution
In this section we assume that the time-stamping service (TSS) is trusted, and
describe two improvements on the naive solution above.
4.1. Hash
Our first simplification is to make use of a family of cryptographically secure
collision-free hash functions. This is a family of functions $h: \{0,1\}^* \to \{0,1\}^l$
compressing bit-strings of arbitrary length to bit-strings of a fixed length l, with the
following properties:
1. The functions h are easy to compute, and it is easy to pick a member of the
family at random.
2. It is computationally infeasible, given one of these functions h, to find a pair
of distinct strings x, x' satisfying h(x) = h(x'). (Such a pair is called a collision
for h.)
The practical importance of such functions has been known for some time, and
researchers have used them in a number of schemes; see, for example, [7], [15], and
[16]. Damgård gave the first formal definition, and a constructive proof of their
existence, on the assumption that there exist one-way "claw-free" permutations [4].
For this, any "one-way group action" is sufficient [3].
Naor and Yung defined the similar notion of "universal one-way hash functions,"
which satisfy, in place of the second condition above, the slightly weaker requirement
that it be computationally infeasible, given a string x, to compute another
string $x' \neq x$ satisfying h(x) = h(x') for a randomly chosen h. They were able to
construct such functions on the assumption that there exist one-to-one one-way
functions [17]. Rompel has recently shown that such functions exist if there exist
one-way functions at all [20]. See Section 6.3 below for a discussion of the differences
between these two sorts of cryptographic hash functions.
There are practical implementations of hash functions, for example, that of Rivest
[19], which seem to be reasonably secure.
We use the hash functions as follows. Instead of transmitting his document x to
the TSS, a client sends only its hash value h(x) = y. For the purposes of
authentication, time-stamping y is equivalent to time-stamping x. This greatly
reduces the bandwidth problem and the storage requirements, and solves the
privacy issue as well. Depending on the design goals for an implementation of
time-stamping, there may be a single hash function used by everybody, or different
hash functions for different users.
For the rest of this paper, we speak of time-stamping hash values y (random-appearing
bit-strings of a fixed length). Part of the procedure for validating a
time-stamp will be to produce the preimage document x that satisfies h(x) = y;
inability to produce such an x invalidates the putative time-stamp.
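As a minimal sketch of this step, the following Python fragment uses SHA-256 from the standard library as a stand-in for the hash function h. The paper does not prescribe a particular function, and the names hash_document and matches are illustrative only. The client sends just y to the TSS; validating a time-stamp later requires producing an x with h(x) = y.

```python
import hashlib
import hmac

def hash_document(x: bytes) -> bytes:
    """Return y = h(x). SHA-256 is an assumed stand-in for h."""
    return hashlib.sha256(x).digest()

def matches(x: bytes, y: bytes) -> bool:
    """Check that x is a preimage of the time-stamped hash value y."""
    return hmac.compare_digest(hash_document(x), y)

document = b"Lab notebook entry: first sketch of the idea."
y = hash_document(document)              # only these 32 bytes go to the TSS
assert matches(document, y)              # the genuine document validates
assert not matches(document + b"!", y)   # any change to x invalidates the stamp
```

Because only y leaves the client, the TSS learns nothing about the document beyond its hash value, and the request has a fixed size regardless of the document's length.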
4.2. Signature
The second improvement makes use of digital signatures. Informally, a signature
scheme is an algorithm for a party, the signer, to tag messages in a way that uniquely
identifies the signer. Digital signatures were proposed by Rabin [18] and by Diffie
and Hellman [7]. After a long sequence of papers by many authors, Rompel [20]
showed that the existence of one-way functions can be used in order to design a
signature scheme satisfying the very strong notion of security that was first defined
by Goldwasser et al. [10].
With a secure signature scheme available, when the TSS receives the hash value,
it appends the date and time, then signs this compound document and sends it to
the client. By checking the signature, the client is assured that the TSS actually did
process the request, that the hash was correctly received, and that the correct time
is included. This takes care of the problem of present and future incompetence on
the part of the TSS, and reduces the need for the TSS to store records.
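The signing step can be sketched as follows. This is an illustration under stated assumptions, not the scheme the paper prescribes: it swaps in a Lamport one-time signature built from SHA-256, chosen only because it needs nothing beyond the Python standard library. A Lamport key is safe for a single message, so a real service would use a many-time signature scheme. The function names, the JSON encoding of the certificate, and the time string are all hypothetical.

```python
import hashlib
import json
import secrets

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def keygen():
    """Lamport key pair: 256 pairs of random secrets and their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk

def digest_bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    d = H(message)
    return [(d[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(sk, message: bytes):
    """Reveal one secret per digest bit (the key must never be reused)."""
    return [sk[i][bit] for i, bit in enumerate(digest_bits(message))]

def verify(pk, message: bytes, signature) -> bool:
    bits = digest_bits(message)
    return len(signature) == 256 and all(
        H(s) == pk[i][bit] for i, (s, bit) in enumerate(zip(signature, bits))
    )

def tss_stamp(sk, y: bytes, now: str):
    """TSS side: append the time to the hash value and sign the pair."""
    certificate = json.dumps({"time": now, "hash": y.hex()}, sort_keys=True).encode()
    return certificate, sign(sk, certificate)

# Client side: check the signature, the echoed hash value, and the time.
sk, pk = keygen()
y = H(b"my document")
cert, sig = tss_stamp(sk, y, "1990-08-19T12:00:00Z")
assert verify(pk, cert, sig)
assert json.loads(cert)["hash"] == y.hex()
assert not verify(pk, cert.replace(b"1990", b"1989"), sig)  # altered time is rejected
```

The client keeps the signed certificate, so the TSS itself need not retain a record of the request.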
What we have described so far is, we believe, a practical method for time-stamping
digital documents of arbitrary length. However, neither the signature nor the use
of hash functions in any way prevents a time-stamping service from issuing a false
time-stamp. Ideally, we would like a mechanism which guarantees that no matter
how unscrupulous the TSS is, the times it certifies will always be the correct ones,
and that it will be unable to issue incorrect time-stamps even if it tries to.
It may seem difficult to specify a time-stamping procedure so as to make it
impossible to produce fake time-stamps. After all, if the output of an algorithm
A, given as input a document x and some timing information $\tau$, is a bit-string
$c = A(x, \tau)$ that stands as a legitimate time-stamp for x, what is to prevent a
forger some time later from computing the same timing information $\tau$ and then
running A to produce the same certificate c? The question is relevant even if A is a
probabilistic algorithm.
Our task may be seen as the problem of simulating the action of a trusted TSS,
in the absence of generally trusted parties. There are two rather different approaches
we might take, and each one leads to a solution. The first approach is to constrain
a centralized but possibly untrustworthy TSS to produce genuine time-stamps in
such a way that fake ones are difficult to produce. The second approach is somehow
to distribute the required trust among the users of the service. It is not clear that
either of these can be done at all.
5.1. Linking
Our first solution begins by observing that the sequence of clients requesting
time-stamps and the hashes they submit cannot be known in advance. So if we
include bits from the previous sequence of client requests in the signed certificate,
then we know that the time-stamp occurred after these requests. But the requirement
of including bits from previous documents in the certificate can also be used to solve
the problem of constraining the time in the other direction, because the time-
stamping company cannot issue later certificates unless it has the current request
in hand.
We describe two variants of this linking scheme; the first one, slightly simpler,
highlights our main idea, while the second one may be preferable in practice. In
both variants, the TSS makes use of a collision-free hash function, denoted H. This
is in addition to clients' use of hash functions in order to produce the hash value of
any documents that they wish to have time-stamped.
To be specific, a time-stamping request consists of an $l$-bit string $y$ (presumably
the hash value of the document) and a client identification number ID. We use $\sigma(\cdot)$
to denote the signing procedure used by the TSS. The TSS issues signed, sequentially
numbered time-stamp certificates. In response to the request $(y_n, \mathrm{ID}_n)$ from our
client, the nth request in sequence, the TSS does two things:
1. The TSS sends our client the signed certificate $s = \sigma(C_n)$, where the certificate
$$C_n = (n, t_n, \mathrm{ID}_n, y_n; L_n)$$
consists of the sequence number $n$, the time $t_n$, the client number $\mathrm{ID}_n$ and the
hash value $y_n$ from the request, and certain linking information, which comes
from the previously issued certificate: $L_n = (t_{n-1}, \mathrm{ID}_{n-1}, y_{n-1}, H(L_{n-1}))$.
2. When the next request has been processed, the TSS sends our client the
identification number $\mathrm{ID}_{n+1}$ for that next request.
Having received $s$ and $\mathrm{ID}_{n+1}$ from the TSS, she checks that $s$ is a valid signature of
a good certificate, i.e., one that is of the correct form $(n, t, \mathrm{ID}_n, y_n; L_n)$, containing
the correct time $t$.
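The issuing procedure above can be sketched in Python as follows. This is a sketch under assumptions rather than a definitive implementation: SHA-256 stands in for H, the signature is omitted to keep the example short, the class and field names are hypothetical, and the placeholder used as linking information for the very first certificate is our own choice, which the text does not specify.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple

Link = Tuple[str, int, str, str]    # (t_{n-1}, ID_{n-1}, y_{n-1}, H(L_{n-1}))

def H(link: Link) -> str:
    """The TSS's hash function H applied to linking information (SHA-256 assumed)."""
    return hashlib.sha256(repr(link).encode()).hexdigest()

GENESIS: Link = ("", 0, "", "")     # assumed placeholder for L_1, not from the paper

@dataclass(frozen=True)
class Certificate:
    n: int            # sequence number
    t: str            # time t_n
    client_id: int    # ID_n
    y: str            # hash value y_n from the request
    link: Link        # linking information L_n

class LinkingTSS:
    def __init__(self) -> None:
        self.last: Optional[Certificate] = None

    def request(self, t: str, client_id: int, y: str) -> Certificate:
        """Issue C_n, linked to the previously issued certificate."""
        prev = self.last
        if prev is None:
            n, link = 1, GENESIS
        else:
            n = prev.n + 1
            link = (prev.t, prev.client_id, prev.y, H(prev.link))
        cert = Certificate(n, t, client_id, y, link)
        self.last = cert
        return cert

tss = LinkingTSS()
c1 = tss.request("09:00", 17, "aa")
c2 = tss.request("09:05", 42, "bb")
assert c2.link == ("09:00", 17, "aa", H(c1.link))   # C_2 embeds the request behind C_1
```

In this sketch, once c2 has been issued the TSS would also send c2.client_id to client 17, which is the second step of the procedure.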
If her time-stamped document x is later challenged, the challenger first checks
that the time-stamp $(s, \mathrm{ID}_{n+1})$ is of the correct form (with $s$ being a signature of a
certificate that indeed contains a hash of x). In order to make sure that our client
has not colluded with the TSS, the challenger can call client $\mathrm{ID}_{n+1}$ and ask him to
produce his time-stamp $(s', \mathrm{ID}_{n+2})$. This includes a signature
$$s' = \sigma(n+1, t_{n+1}, \mathrm{ID}_{n+1}, y_{n+1}; L_{n+1})$$
of a certificate that contains in its linking information $L_{n+1}$ a copy of her hash value
$y_n$. This linking information is further authenticated by the inclusion of the image
$H(L_n)$ of her linking information $L_n$. An especially suspicious challenger now can
call up client $\mathrm{ID}_{n+2}$ and verify the next time-stamp in the sequence; this can continue
for as long as the challenger wishes. Similarly, the challenger can also follow the
chain of time-stamps backward, beginning with client $\mathrm{ID}_{n-1}$.
Why does this constrain the TSS from producing bad time-stamps? First, observe
that the use of the signature has the effect that the only way to fake a time-stamp is
with the collaboration of the TSS. But the TSS cannot forward-date a document,
because the certificate must contain bits from requests that immediately preceded
the desired time, yet the TSS has not received them. The TSS cannot feasibly
back-date a document by preparing a fake time-stamp for an earlier time, because
bits from the document in question must be embedded in certificates immediately
following that earlier time, yet these certificates have already been issued. Fur-
thermore, correctly embedding a new document into the already-existing stream
of time-stamp certificates requires the computation of a collision for the hash
function H.
Thus the only possible spoof is to prepare a fake chain of time-stamps, long
enough to exhaust the most suspicious challenger that one anticipates.
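To make the challenger's check concrete, here is a hedged Python sketch of following the chain forward and of what happens when a different hash value is swapped into an already-issued position. As assumptions, SHA-256 stands in for H, signatures are left out, certificates are plain tuples (n, t, ID, y, L), and the helper names and the placeholder for the first L are hypothetical.

```python
import hashlib

def H(link) -> str:
    """Stand-in for the TSS's hash function H (SHA-256 assumed)."""
    return hashlib.sha256(repr(link).encode()).hexdigest()

def make_chain(requests):
    """Build certificates (n, t, ID, y, L) as a linking TSS would."""
    chain, link = [], ("", 0, "", "")       # assumed placeholder for the first L
    for n, (t, cid, y) in enumerate(requests, start=1):
        chain.append((n, t, cid, y, link))
        link = (t, cid, y, H(link))         # linking information for the next one
    return chain

def follows(cert, nxt) -> bool:
    """Does nxt's linking information embed cert, as the challenger expects?"""
    n, t, cid, y, link = cert
    return nxt[0] == n + 1 and nxt[4] == (t, cid, y, H(link))

def verify_forward(chain, start: int, steps: int) -> bool:
    """Follow the chain forward from index start for at most `steps` links."""
    end = min(start + steps, len(chain) - 1)
    return all(follows(chain[i], chain[i + 1]) for i in range(start, end))

chain = make_chain([("09:00", 17, "aa"), ("09:05", 42, "bb"),
                    ("09:07", 8, "cc"), ("09:30", 23, "dd")])
assert verify_forward(chain, 0, 3)

# Back-dating attempt: put a different hash value into certificate 2.
n, t, cid, _, link = chain[1]
forged = chain[:1] + [(n, t, cid, "ee", link)] + chain[2:]
assert verify_forward(forged, 0, 1)       # the forgery still links backward correctly
assert not verify_forward(forged, 1, 1)   # but certificate 3 embeds the original y
```

The forged certificate is exposed as soon as the challenger consults the next client, whose certificate was issued before the forgery and cannot be changed without that client's cooperation.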
In the scheme just outlined, clients must keep all their certificates. In order to
relax this requirement, in the second variant of this scheme we link each request
not just to the next request but to the next k requests. The TSS responds to the nth
request as follows:
1. As above, the certificate $C_n$ is of the form $C_n = (n, t_n, \mathrm{ID}_n, y_n; L_n)$, where now
the linking information $L_n$ is of the form
$$L_n = [(t_{n-k}, \mathrm{ID}_{n-k}, y_{n-k}, H(L_{n-k})), \ldots, (t_{n-1}, \mathrm{ID}_{n-1}, y_{n-1}, H(L_{n-1}))].$$
2. After the next k requests have been processed, the TSS sends our client the list
$(\mathrm{ID}_{n+1}, \ldots, \mathrm{ID}_{n+k})$.
After checking that this client's time-stamp is of the correct form, a suspicious
challenger can ask any one of the next k clients $\mathrm{ID}_{n+i}$ to produce his time-stamp.
As above, his time-stamp includes a signature of a certificate that contains in its
linking information $L_{n+i}$ a copy of the relevant part of the challenged time-stamp
certificate $C_n$, authenticated by the inclusion of the hash by H of the challenged
client's linking information $L_n$. His time-stamp also includes client numbers
$(\mathrm{ID}_{n+i+1}, \ldots, \mathrm{ID}_{n+i+k})$, of which the last i are new ones; the challenger can ask these
clients for their time-stamps, and this can continue for as long as the challenger
wishes.
In addition to easing the requirement that clients save all their certificates, this
second variant also has the property that correctly embedding a new document into
the already-existing stream of time-stamp certificates requires the computation of
a simultaneous k-wise collision for the hash function H, instead of just a pairwise
collision.
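The second variant differs from the first only in how much history each certificate carries. A minimal Python sketch follows, with these assumptions: SHA-256 stands in for H, signatures are omitted, the names KLinkingTSS and embedded_in are hypothetical, and the first k certificates simply carry fewer than k entries (the text does not say how the chain is started).

```python
import hashlib
from collections import deque

def H(link) -> str:
    """Stand-in for the TSS's hash function H (SHA-256 assumed)."""
    return hashlib.sha256(repr(link).encode()).hexdigest()

class KLinkingTSS:
    def __init__(self, k: int) -> None:
        self.n = 0
        # Entries (t, ID, y, H(L)) for the k most recent requests, oldest first.
        self.recent = deque(maxlen=k)

    def request(self, t: str, client_id: int, y: str):
        """Issue C_n = (n, t_n, ID_n, y_n; L_n) with L_n covering the last k requests."""
        self.n += 1
        link = tuple(self.recent)
        cert = (self.n, t, client_id, y, link)
        self.recent.append((t, client_id, y, H(link)))
        return cert

def embedded_in(cert, later) -> bool:
    """Challenger's check: does `later` carry cert's entry in its linking information?"""
    n, t, cid, y, link = cert
    return (t, cid, y, H(link)) in later[4]

tss = KLinkingTSS(k=2)
certs = [tss.request(f"t{i}", i, f"y{i}") for i in range(1, 6)]
assert embedded_in(certs[0], certs[1])       # request 1 appears in certificate 2
assert embedded_in(certs[0], certs[2])       # and in certificate 3
assert not embedded_in(certs[0], certs[3])   # but k = 2, so not in certificate 4
```

With k successors each holding a copy, a challenge can still be met if some of those clients have lost their certificates or cannot be reached.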
6. Remarks
6.1. Tradeoffs
There are a number of tradeoffs between the two schemes. The distributed-trust
scheme has the advantage that all processing takes place when the request is made.
In the linking scheme, on the other hand, the client has a short delay while she waits
for the second part of her certificate; and meeting a later challenge may require
further communication.
A related disadvantage of the linking scheme is that it depends on at least some
parties (clients or, perhaps, the TSS) storing their certificates.
The distributed-trust scheme makes a greater technological demand on the
system: the ability to call up and demand a quick signed response at will.
The linking scheme only locates the time of a document between the times of the
previous and the next requests, so it is best suited to a setting in which relatively
many documents are submitted for time-stamping, compared with the scale at which
the timing matters.
It is worth remarking that the time-constraining properties of the linking scheme
do not depend on the use of digital signatures.
On the other hand, if the time-stamping event can be made part of the document
creation event, then the constraint holds in both directions. For example, consider
the sequence of phone conversations that pass through a given switch. In order to
process the next call on this switch, we could require that linking information be
provided from the previous call. Similarly, at the end of the call, linking information
would be passed onto the next call. In this way, the document creation event (the
phone call) includes a time-stamping event, and so the time of the phone call can
be fixed in both directions. The same idea could apply to sequential financial
transactions, such as stock trades or currency exchanges, or any sequence of
electronic interactions that take place over a given physical connection.
time-stamping to the simple assumption that one-way functions exist. This is the
minimum reasonable assumption for us, since all of complexity-based cryptography
requires the existence of one-way functions [12], [13].
7. Applications
8. Summary
In this paper we have shown that the growing use of text, audio, and video
documents in digital form and the ease with which such documents can be modified
create a new problem: how can we certify when a document was created or last
modified? Methods of certification, or time-stamping, must satisfy two criteria.
First, they must time-stamp the actual bits of the document, making no assumptions
about the physical medium on which the document is recorded. Second, the date
and time of the time-stamp must not be forgeable.
We have proposed two solutions to this problem. Both involve the use of one-way
hash functions, whose outputs are processed in lieu of the actual documents, and
of digital signatures. The solutions differ only in the way that the date and time are
made unforgeable. In the first, the hashes of documents submitted to a TSS are
linked together, and certificates recording the linking of a given document are
distributed to other clients both upstream and downstream from that document.
In the second solution, several members of the client pool must time-stamp the hash.
The members are chosen by means of a pseudorandom generator that uses the hash
of the document itself as seed. This makes it infeasible to choose deliberately which
clients should and should not time-stamp a given hash. The second method could
be implemented without the need for a centralized TSS at all.
Finally, we have considered whether time-stamping could be extended to enhance
the authenticity of documents for which the time of creation itself is not the critical
issue. This is the case for a large class of documents which we call "tamper-
unpredictable." We further conjecture that no purely algorithmic scheme can add
any more credibility to a document than time-stamping provides.
Acknowledgments
References
[1] J. Alter. When photographs lie. Newsweek, pp. 44-45, July 30, 1990.
[2] M. Blum and S. Micali. How to generate cryptographically strong sequences of pseudo-random
bits. SIAM J. Comput., 13(4): 850-864, Nov. 1984.
[3] G. Brassard and M. Yung. One-way group actions. In Advances in Cryptology--Crypto '90. Lecture
Notes in Computer Science, Springer-Verlag, Berlin, to appear.
[4] I. Damgård. Collision-free hash functions and public-key signature schemes. In Advances in
Cryptology--Eurocrypt '87, pp. 203-217. Lecture Notes in Computer Science, vol. 304, Springer-
Verlag, Berlin, 1988.
[5] I. Damgård. A design principle for hash functions. In Advances in Cryptology--Crypto '89
(ed. G. Brassard), pp. 416-427. Lecture Notes in Computer Science, vol. 435, Springer-Verlag,
Berlin, 1990.
[6] A. DeSantis and M. Yung. On the design of provably secure cryptographic hash functions. In
Advances in Cryptology--Eurocrypt '90. Lecture Notes in Computer Science, Springer-Verlag,
Berlin, to appear.
[7] W. Diffie and M. E. Hellman. New directions in cryptography. IEEE Trans. Inform. Theory,
22: 644-654, Nov. 1976.
[8] Z. Galil, S. Haber, and M. Yung. Interactive public-key cryptosystems. Submitted for publication,
1990.
[9] S. Goldwasser and S. Micali. Probabilistic encryption. J. Comput. System Sci., 28: 270-299, April
1984.
[10] S. Goldwasser, S. Micali, and R. Rivest. A secure digital signature scheme. SIAM J. Comput.,
17(2): 281-308, 1988.
[11] A. Grundberg. Ask it no questions: The camera can lie. The New York Times, Section 2, pp. 1, 29,
August 12, 1990.
[12] R. Impagliazzo, L. Levin, and M. Luby. Pseudorandom generation from one-way functions. In
Proc. 21st STOC, pp. 12-24. ACM, New York, 1989.
[13] R. Impagliazzo and M. Luby. One-way functions are essential for complexity-based cryptography.
In Proc. 30th FOCS, pp. 230-235. IEEE, New York, 1989.
[14] H. M. Kanare. Writing the Laboratory Notebook, p. 117. American Chemical Society, Washington,
D.C., 1985.
[15] R. C. Merkle. Secrecy, authentication, and public-key systems. Ph.D. thesis, Stanford University,
1979.
[16] R. C. Merkle. One-way hash functions and DES. In Advances in Cryptology--Crypto '89
(ed. G. Brassard), pp. 428-446. Lecture Notes in Computer Science, vol. 435, Springer-Verlag,
Berlin, 1990.
[17] M. Naor and M. Yung. Universal one-way hash functions and their cryptographic applications.
In Proc. 21st STOC, pp. 33-43. ACM, New York, 1989.
[18] M. O. Rabin. Digitalized signatures. In Foundations of Secure Computation (ed. R. A. DeMillo
et al.), pp. 155-168. Academic Press, New York, 1978.
[19] R. Rivest. The MD4 message digest algorithm. In Advances in Cryptology--Crypto '90. Lecture
Notes in Computer Science, Springer-Verlag, Berlin, to appear.
[20] J. Rompel. One-way functions are necessary and sufficient for secure signatures. In Proc. 22nd
STOC, pp. 387-394. ACM, New York, 1990.
[21] C. Shannon. Prediction and entropy of printed English. Bell System Tech. J., 30: 50-64, 1951.
[22] A. C. Yao. Theory and applications of trapdoor functions. In Proc. 23rd FOCS, pp. 80-91. IEEE,
New York, 1982.