My Report - Understanding Blockchain
1-What is blockchain?
Blockchain is a decentralized data storage system: a ledger of facts distributed across a
peer-to-peer network, secured by cryptography and ordered in blocks.
3-What is a block?
A block is a list of permanent transaction records, created by miners. A block contains the
hash of the previous block, a record of some or all of the recent transactions, and its own
proof of work (the answer to a difficult-to-solve mathematical puzzle). When a miner finishes
creating a block, it sends it to all the other nodes in the network. When the other miners
receive a new block, they interrupt their current work and process the new block. A block
stays at the end of the chain only if the majority of the network's mining power accepts it
and builds on it; once it is buried under later blocks, it can in practice no longer be
changed or removed. Blocks are organized into a linear sequence over time (the blockchain).
3.1-Structure of a block
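As a rough illustration of the fields described above, a block can be modeled as follows.
This is a minimal sketch, not Bitcoin's actual serialization: the names Block, prev_hash,
transactions, nonce and block_hash are hypothetical, and JSON is used only to get a
deterministic byte string to hash.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Block:
    """Toy block: field names are illustrative, not Bitcoin's real layout."""
    prev_hash: str                                     # hash of the previous block
    transactions: list = field(default_factory=list)   # recent transaction records
    nonce: int = 0                                     # value the miner varies to solve the puzzle

    def block_hash(self) -> str:
        # Serialize deterministically, then hash the whole content.
        payload = json.dumps(
            {"prev": self.prev_hash, "txs": self.transactions, "nonce": self.nonce},
            sort_keys=True,
        ).encode()
        return hashlib.sha256(payload).hexdigest()


genesis = Block(prev_hash="0" * 64, transactions=["A pays B 10"])
second = Block(prev_hash=genesis.block_hash(), transactions=["B pays C 4"])
```

Because second stores the hash of genesis, any change to genesis changes that hash and
breaks the link between the two blocks.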
4-What is a transaction?
A transaction is an exchange of information that is broadcast to the network and collected
into blocks. It is the simplest thing that can be recorded in a blockchain, but not the only
one. Transactions are not encrypted, so it is possible to browse and view every transaction
ever collected into a block. Example 1: suppose user A sends 10 bitcoins to two other users,
B and C. Such a transfer is called a transaction.
Fig.1: Example of a bitcoin Byte-map of Transaction with each type of TxIn and TxOut.
General format of a transaction (inside a block)
Input
An input is a reference to an output from a previous transaction. Multiple inputs are often
listed in a transaction. All of the new transaction's input values (that is, the total coin value
of the previous outputs referenced by the new transaction's inputs) are added up, and the
total (less any transaction fee) is completely used by the outputs of the new transaction.
Previous tx is a hash of a previous transaction. Index is the specific output in the referenced
transaction. ScriptSig is the first half of a script.
The script contains two components, a signature and a public key. The public key must
match the hash given in the script of the redeemed output. The public key is used to verify
the redeemer's signature, which is the second component. More precisely, the
second component is an ECDSA signature over a hash of a simplified version of the
transaction. It, combined with the public key, proves the transaction was created by the real
owner of the address in question. Various flags define how the transaction is simplified and
can be used to create different types of payment.
Output
An output contains instructions for sending bitcoins. Value is the number of Satoshi (1 BTC =
100,000,000 Satoshi) that this output will be worth when claimed. ScriptPubKey is the
second half of a script (discussed later). There can be more than one output, and they share
the combined value of the inputs. Because each output from one transaction can only ever
be referenced once by an input of a subsequent transaction, the entire combined input
value needs to be sent in an output if you don't want to lose it. If the input is worth 50 BTC
but you only want to send 25 BTC, Bitcoin will create two outputs worth 25 BTC: one to the
destination, and one back to you (known as "change", though you send it to yourself). Any
input bitcoins not redeemed in an output are considered a transaction fee; whoever generates
the block will get it.
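The value accounting described above can be sketched in a few lines. This is only an
illustration under simple assumptions: the helper names build_outputs and implied_fee are
hypothetical, and all values are plain integers in satoshi.

```python
SATOSHI_PER_BTC = 100_000_000


def build_outputs(input_values, amount, fee):
    """Return (payment, change) in satoshi for a toy transaction.

    input_values: values of the previous outputs being spent (satoshi)
    amount:       value sent to the destination (satoshi)
    fee:          value deliberately left unclaimed for the miner (satoshi)
    """
    total_in = sum(input_values)
    change = total_in - amount - fee
    if change < 0:
        raise ValueError("inputs do not cover amount plus fee")
    return amount, change


def implied_fee(input_values, output_values):
    # Whatever the outputs do not claim goes to whoever generates the block.
    return sum(input_values) - sum(output_values)


# The 50 BTC input from the text, of which 25 BTC is sent to the destination.
inputs = [50 * SATOSHI_PER_BTC]
payment, change = build_outputs(inputs, 25 * SATOSHI_PER_BTC, fee=0)
```

Leaving out the change output would not return the remaining 25 BTC to the sender:
implied_fee(inputs, [payment]) shows that it would all become a transaction fee.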
A sends 100 BTC to C and C generates 50 BTC. C sends 101 BTC to D, and he needs to send
himself some change. D sends the 101 BTC to someone else, but they haven't redeemed it
yet. Only D's output and C's change are capable of being spent in the current state.
Verification
To verify that inputs are authorized to collect the values of referenced outputs, Bitcoin uses
a custom Forth-like scripting system. The input's scriptSig and the referenced output's
scriptPubKey are evaluated (in that order), with scriptPubKey using the values left on the
stack by scriptSig. The input is authorized if scriptPubKey returns true. Through the scripting
system, the sender can create very complex conditions that people have to meet in order to
claim the output's value. For example, it's possible to create an output that can be claimed
by anyone without any authorization. It's also possible to require that an input be signed by
ten different keys, or be redeemable with a password instead of a key.
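To make the evaluation order concrete, here is a minimal sketch of a stack evaluator that runs
scriptSig first and scriptPubKey second on one shared stack. It is not Bitcoin's interpreter:
only four opcodes are modeled, signature checking is left out, the truth test is simplified
to "top item is non-empty", and the function name run_script and the password are
hypothetical. The example output is the "password instead of a key" case mentioned above.

```python
import hashlib


def run_script(script_sig, script_pubkey):
    """Evaluate scriptSig, then scriptPubKey, on one shared stack.

    Items are either bytes (data pushes) or strings (opcode names).
    """
    stack = []
    for item in list(script_sig) + list(script_pubkey):
        if isinstance(item, bytes):
            stack.append(item)                      # data push
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_SHA256":
            stack.append(hashlib.sha256(stack.pop()).digest())
        elif item == "OP_EQUAL":
            a, b = stack.pop(), stack.pop()
            stack.append(b"\x01" if a == b else b"")
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():
                return False                        # script fails immediately
        else:
            raise ValueError(f"unsupported opcode: {item}")
    # Simplified truth test: authorized if the top stack item is non-empty.
    return bool(stack) and stack[-1] != b""


# Output redeemable with a password instead of a key.
password_hash = hashlib.sha256(b"correct horse").digest()
script_pubkey = ["OP_SHA256", password_hash, "OP_EQUAL"]
```

Here run_script([b"correct horse"], script_pubkey) authorizes the input, while any other
password leaves an empty (false) value on the stack.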
Types of Transaction
Bitcoin currently creates two different scriptSig/scriptPubKey pairs. These are described
below.
It is possible to design more complex types of transactions, and link them together into
cryptographically enforced agreements. These are known as Contracts.
Pay-to-PubkeyHash
scriptPubKey: OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
scriptSig: <sig> <pubKey>
A Bitcoin address is only a hash, so the sender can't provide a full public key in scriptPubKey.
When redeeming coins that have been sent to a Bitcoin address, the recipient provides both
the signature and the public key. The script verifies that the provided public key does hash to
the hash in scriptPubKey, and then it also checks the signature against the public key.
Pay-to-Script-Hash
scriptPubKey: OP_HASH160 <scriptHash> OP_EQUAL
scriptSig: ..signatures... <serialized script>
P2SH addresses were created with the motivation of moving "the responsibility for supplying
the conditions to redeem a transaction from the sender of the funds to the redeemer. They
allow the sender to fund an arbitrary transaction, no matter how complicated, using a
20-byte hash" [1]. Pay-to-PubkeyHash addresses are similarly a 20-byte hash of the public key.
Fig.2: Example of chained transactions. A sends 100 BTC to C and C generates 50 BTC. C sends
101 BTC to D, and he needs to send himself some change. D sends the 101 BTC to someone else,
but they haven't redeemed it yet. Only D's output and C's change are capable of being spent in
the current state.
Note that every record in a ledger is linked to its creator via the creator's public key, and
each transaction is signed (not encrypted) with the creator's private key.
5.1-Why ordered?
In Example 1, if A's funds are only 10 bitcoins, C cannot receive the bitcoins: the history of
user A proves that he already sent 10 bitcoins to user B and no longer has sufficient funds
for a transaction with user C. This is possible because his previous transactions are ordered
in every copy of the blockchain ledger.
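The role of ordering can be illustrated by replaying Example 1 in ledger order. This sketch
uses account balances for simplicity, which is an assumption of the example: Bitcoin itself
tracks unspent outputs rather than balances. The function name apply_in_order is hypothetical.

```python
def apply_in_order(balances, transactions):
    """Replay transactions in ledger order and reject any that overspend.

    balances:     dict of user -> funds
    transactions: list of (sender, recipient, amount) in ledger order
    """
    balances = dict(balances)          # work on a copy
    accepted, rejected = [], []
    for sender, recipient, amount in transactions:
        if balances.get(sender, 0) >= amount:
            balances[sender] -= amount
            balances[recipient] = balances.get(recipient, 0) + amount
            accepted.append((sender, recipient, amount))
        else:
            rejected.append((sender, recipient, amount))
    return balances, accepted, rejected


# A owns 10 bitcoins and tries to send 10 to B and then 10 to C.
final, accepted, rejected = apply_in_order(
    {"A": 10}, [("A", "B", 10), ("A", "C", 10)]
)
```

Because every node replays the same transactions in the same order, every node rejects the
same second payment.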
6- Mining
Mining is the process of adding transaction records to Bitcoin's public ledger of past
transactions. Mining is intentionally designed to be resource-intensive and difficult so that
the number of blocks found each day by miners remains steady. Individual blocks must
contain a proof of work to be considered valid. This proof of work is verified by other Bitcoin
nodes each time they receive a block. Bitcoin uses the hashcash proof-of-work function.
Proof of work
A proof of work is a piece of data which is difficult (costly, time-consuming) to produce but
easy for others to verify and which satisfies certain requirements. Hashcash proofs of work
are used in Bitcoin for block generation. In order for a block to be accepted by network
participants, miners must complete a proof of work which covers all of the data in the block.
The difficulty of this work is adjusted so as to limit the rate at which new blocks can be
generated by the network to one every 10 minutes. Due to the very low probability of
successful generation, this makes it unpredictable which worker computer in the network
will be able to generate the next block.
For a block to be valid it must hash to a value less than the current target; this means that
each block indicates that work has been done generating it. Each block contains the hash of
the preceding block, thus each block has a chain of blocks that together contain a large
amount of work. Changing a block (which can only be done by making a new block
containing the same predecessor) requires regenerating all successors and redoing the work
they contain. This protects the block chain from tampering.
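The chaining argument can be sketched as follows. To keep the example short, the proof of
work itself is left out and only the hash links are modeled; the dictionary layout and the
names link_hash, build_chain and first_invalid are assumptions of this sketch.

```python
import hashlib


def link_hash(prev_hash, data):
    # Each block's hash covers its predecessor's hash and its own data.
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()


def build_chain(records):
    chain, prev = [], "0" * 64
    for data in records:
        h = link_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": h})
        prev = h
    return chain


def first_invalid(chain):
    """Return the index of the first block whose links no longer match, or None."""
    prev = "0" * 64
    for i, block in enumerate(chain):
        if block["prev"] != prev or block["hash"] != link_hash(prev, block["data"]):
            return i
        prev = block["hash"]
    return None


chain = build_chain(["tx set 1", "tx set 2", "tx set 3"])
tampered = [dict(b) for b in chain]
tampered[0]["data"] = "tx set 1 (edited)"
```

Editing the first block makes it invalid; recomputing its hash only moves the problem to the
next block, so every successor has to be regenerated (and, in Bitcoin, re-mined).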
The most widely used proof-of-work scheme is based on SHA-256 and was introduced as a
part of Bitcoin.
Example: Let's say the base string that we are going to do work on is "Hello, world!". Our
target is to find a variation of it that SHA-256 hashes to a value beginning with '000'. We
vary the string by adding an integer value to the end called a nonce and incrementing it each
time. Finding a match for
"Hello, world!" takes us 4251 tries (but happens to have zeroes in the first four digits):
"Hello, world!0" =>
1312af178c253f84028d480a6adc1e25e81caa44c749ec81976192e2ec934c64
"Hello, world!1" =>
e9afc424b79e4f6ab42d99c81156d3a17228d6e1eef4139be78e948a9332a7d8
"Hello, world!2" =>
ae37343a357a8297591625e7134cbea22f5928be8ca2a32aa475cf05fd4266b7
...
"Hello, world!4248" =>
6e110d98b388e77e9c6f042ac6b497cec46660deef75a55ebc7cfdf65cc0b965
"Hello, world!4249" =>
c004190b822f1669cac8dc37e761cb73652e7832fb814565702245cf26ebb9e6
"Hello, world!4250" =>
0000c3af42fc31103f1fdc0151fa747ff87349a4714df7cc52ea464e12dcd4e9
4251 hashes on a modern computer is not very much work (most computers can achieve at
least 4 million hashes per second). Bitcoin automatically varies the difficulty (and thus the
amount of work required to generate a block) to keep a roughly constant rate of block
generation.
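The search in this example can be reproduced with a short script. It is a minimal sketch: the
function name find_nonce and the max_tries limit are choices made for this illustration.

```python
import hashlib


def find_nonce(base, prefix="000", max_tries=1_000_000):
    """Append 0, 1, 2, ... to base until the SHA-256 hex digest starts with prefix."""
    for nonce in range(max_tries):
        digest = hashlib.sha256(f"{base}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    raise RuntimeError("no match within max_tries")


nonce, digest = find_nonce("Hello, world!")
```

If the digests listed above are accurate, the search should stop at nonce 4250. Each extra
hex zero required in the prefix multiplies the expected number of tries by 16.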
6.1-Hashcash
Bitcoin uses the hashcash proof-of-work function as the mining core. The hashcash
algorithm requires the following parameters: a service string, a nonce, and a counter. In
bitcoin the service string is encoded in the block header data structure, and includes a
version field, the hash of the previous block, the root hash of the merkle tree of all
transactions in the block, the current time, and the difficulty. Bitcoin stores the nonce in the
extraNonce field which is part of the coinbase transaction, which is stored as the left most
leaf node in the merkle tree (the coinbase is the special first transaction in the block). The
counter parameter is small at 32-bits so each time it wraps the extraNonce field must be
incremented (or otherwise changed) to avoid repeating work.
The hashcash algorithm repeatedly hashes the block header while incrementing the counter
& extraNonce fields. Incrementing the extraNonce field entails recomputing the merkle tree,
as the coinbase transaction is the left most leaf node. The block is also occasionally updated
as you are working on it.
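The interplay between the counter, the extraNonce and the merkle tree can be sketched as a
toy search loop. Several simplifications are assumed here: the header holds only the previous
hash, the merkle root and the counter (version, time and difficulty are omitted), the
coinbase encoding is made up, the hash is compared as a big-endian number, and the counter is
limited to 8 bits so that it wraps quickly. The names sha256d, merkle_root and mine are
hypothetical.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    # Double SHA-256, as used by Bitcoin.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Hash pairs level by level up to a single root, duplicating the last
    hash when a level has an odd number of entries."""
    level = [sha256d(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def mine(prev_hash, other_txs, target, counter_bits=8, max_extra=1000):
    """Toy search loop: when the counter space is exhausted, change extraNonce."""
    for extra_nonce in range(max_extra):
        coinbase = b"coinbase|" + str(extra_nonce).encode()   # left-most leaf
        root = merkle_root([coinbase] + other_txs)            # recomputed per extraNonce
        for counter in range(2 ** counter_bits):
            header = prev_hash + root + counter.to_bytes(4, "little")
            digest = sha256d(header)
            if int.from_bytes(digest, "big") < target:
                return extra_nonce, counter, digest
    raise RuntimeError("no solution found")


target = 2 ** (256 - 12)   # roughly one success per 4096 hashes
extra_nonce, counter, digest = mine(b"\x00" * 32, [b"tx1", b"tx2"], target)
```

With only 256 counter values per extraNonce, the loop usually has to change extraNonce
several times before a header hashes below the target, which mirrors (at a much smaller
scale) why the 32-bit counter alone is not enough.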
All bitcoin miners are expending their effort creating hashcash proofs-of-work which act as a
vote in the blockchain evolution and validate the blockchain transaction log.
Like many cryptographic algorithms hashcash uses a hash function as a building block, in the
same way that HMAC, or RSA signatures are defined on a pluggable hash-function
(commonly denoted by the naming convention of algorithm-hash: HMAC-SHA1, HMAC-MD5,
HMAC-SHA256, RSA-SHA1, etc), hashcash can be instantiated with different functions,
hashcash-SHA1 (original), hashcash-SHA256^2 (bitcoin), hashcash-Scrypt(iter=1) (litecoin).
Double Hash
Bitcoin is using two hash iterations (denoted SHA256^2 ie "SHA256 function squared") and
the reason for this relates to a partial attack on the smaller but related SHA1 hash. SHA1's
resistance to birthday attacks has been partially broken as of 2005 in O(2^64) vs the design
O(2^80). While hashcash relies on pre-image resistance and so is not vulnerable to birthday
attacks, a generic method of hardening SHA1 against the birthday collision attack is to iterate
it twice. A comparable attack on SHA256 does not exist so far, however as the design of
SHA256 is similar to SHA1 it is probably defensive for applications to use double SHA256.
And this is what bitcoin does, it is not necessary given hashcash reliance on preimage
security, but it is a defensive step against future cryptanalytic developments. The attack on
SHA1 and in principle other hashes of similar design like SHA256, was also the motivation for
the NIST SHA3 design competition, which concluded in 2012 with the selection of Keccak.
The hashcash algorithm is relatively simple to understand. The idea builds on a security
property of cryptographic hashes, that they are designed to be hard to invert (so-called one-
way or pre-image resistant property). You can compute y from x cheaply y=H(x) but it's very
hard to find x given only y. A full hash inversion has a known computationally infeasible
brute-force running time, being O(2^k) where k is the hash size eg SHA256, k=256, and if a
pre-image was found anyone could very efficiently verify it by computing one hash, so there
is a huge asymmetry in full pre-image mining (computationally infeasible) vs verification (a
single hash invocation).
A second hash pre-image means given one-preimage x of hash y where y=H(x), the task is to
find another pre-image of hash y: x' so that y=H(x'). This is not to be confused with a
birthday collision which is to find two values x, x' so that H(x)=H(x'), this can be done in much
lower work O(sqrt(2^k))=O(2^(k/2)) because you can proceed by computing many H(x)
values and storing them until you find a matching pair. It takes a lot of memory, but there
are memory-time tradeoffs.
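The gap between the two attacks can be observed on a deliberately weakened hash. The sketch
below truncates SHA-256 to 16 bits (an assumption made purely so that both searches finish
quickly), giving k=16: a birthday collision is expected after roughly 2^8 tries, a second
pre-image after roughly 2^16. The names h16, birthday_collision and second_preimage are
hypothetical.

```python
import hashlib
from itertools import count


def h16(data: bytes) -> bytes:
    # Toy hash: SHA-256 truncated to 16 bits.
    return hashlib.sha256(data).digest()[:2]


def birthday_collision():
    """Find any two inputs with the same hash by storing every hash seen so far."""
    seen = {}
    for i in count():
        x = str(i).encode()
        y = h16(x)
        if y in seen:
            return seen[y], x, i + 1     # the pair and the number of tries
        seen[y] = x


def second_preimage(x: bytes):
    """Find x' different from x with H(x') = H(x); no stored table helps here."""
    y = h16(x)
    for i in count():
        candidate = b"p" + str(i).encode()
        if candidate != x and h16(candidate) == y:
            return candidate, i + 1


a, b, collision_tries = birthday_collision()
x2, preimage_tries = second_preimage(b"hello")
```

Comparing collision_tries with preimage_tries typically shows the square-root gap described
above, at the cost of the memory used by the seen table.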
Adding purpose
If the partial-pre-image x from y=H(x) is random it is just a disconnected proof-of-work to no
purpose, everyone can see you did do the work, but they don't know why, so users could
reuse the same work for different services. To make the proof-of-work be bound to a
service, or purpose, the hash must include s, a service string so the work becomes to find
H(s,c)/2^(n-k)=0. The miner varies counter c until this is true. The service string could be a
web server domain name, a recipients email address, or in bitcoin a block of the bitcoin
blockchain ledger.
One additional problem is that if multiple people are mining, using the same service string,
they must not start with the same x or they may end up with the same proof, and anyone
looking at it will not honor a duplicated copy of the same work as it could have been copied
without work, the first to present it will be rewarded, and others will find their work
rejected. To avoid risking wasting work in this way, there needs to be a random starting
point, and so the work becomes to find H(s,x,c)/2^(n-k) = 0 where x is random (eg 128-bits to
make it statistically infeasible for two users to maliciously or accidentally start at the same
point), and c is the counter being varied, and s is the service string.
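The search for H(s,x,c)/2^(n-k) = 0 is the same as asking for k leading zero bits in the
hash. A minimal sketch follows; it uses SHA-256 and an ad hoc ":"-separated encoding rather
than the real hashcash stamp format, and the names mint, verify and leading_zero_bits, as well
as the example service string, are hypothetical.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def _hash(service: str, x: bytes, c: int) -> bytes:
    return hashlib.sha256(service.encode() + b":" + x + b":" + str(c).encode()).digest()


def mint(service: str, k: int, salt=None):
    """Vary counter c until H(s, x, c) has at least k leading zero bits."""
    x = os.urandom(16) if salt is None else salt   # random 128-bit start point
    c = 0
    while leading_zero_bits(_hash(service, x, c)) < k:
        c += 1
    return x, c


def verify(service: str, x: bytes, c: int, k: int) -> bool:
    # Verification costs a single hash.
    return leading_zero_bits(_hash(service, x, c)) >= k


x, c = mint("alice@example.com", k=12)
```

Because the service string is part of the hashed data, a proof minted for one service is
very unlikely to verify for another, and the random x keeps two miners from repeating each
other's work.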
This is what hashcash version 1 and bitcoin do. In fact in bitcoin the service string is the
coinbase, and the coinbase includes the recipient's reward address as well as the
transactions to validate in the block. Bitcoin actually does not include a random start point x;
it reuses the reward address as the randomization factor to avoid collisions, which saves
16 bytes of space in the coinbase. For privacy, bitcoin expects the miner to use a different
reward address on each successful block.
More Precise Work
Hashcash as originally proposed has work 2^k where k is an integer, this means difficulty can
only be scaled in powers of 2, this is slightly simpler as you can see and fully measure the
difficulty just by counting 0s in hex/binary and was adequate for prior uses. (A lot of
hashcash design choices are motivated by simplicity).
But because bitcoin needs more precise and dynamic control of work (to target 10-minute
block interval accurately), it changes k to be a fractional (floating-point) value, so the work
becomes to find H(s,x,c) < 2^(n-k) which is equivalent if k is an integer. Bitcoin defines target
= 2^(n-k), so the work can be more simply written to find H(s,x,c) < target. Of course because
of luck the block time actually has quite high variance, but the average is still more
accurately targeted by the introduction of fractional k.
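The effect of a fractional k can be checked numerically. This is a small sketch assuming
n=256 (SHA-256) and a big-endian reading of the hash; the names target_from_k, meets_target
and expected_hashes are hypothetical.

```python
N_BITS = 256  # n: output size of SHA-256 in bits


def target_from_k(k: float) -> int:
    """target = 2^(n-k); k may be fractional."""
    whole = int(k)
    frac = k - whole
    # 2^(n-k) = 2^(n-whole) * 2^(-frac)
    return int((1 << (N_BITS - whole)) * 2 ** (-frac))


def meets_target(digest: bytes, target: int) -> bool:
    return int.from_bytes(digest, "big") < target


def expected_hashes(target: int) -> float:
    # Each hash succeeds with probability target / 2^n.
    return (1 << N_BITS) / target


t_int = target_from_k(32)      # same as requiring 32 leading zero bits
t_frac = target_from_k(32.5)   # about 1.41 times more work than k=32
```

With integer k the work can only double or halve; with fractional k the expected number of
hashes, expected_hashes(target), can be set to any value in between.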
Bitcoin's rate of work is called the network hashrate, measured in GH/sec. As the target block
interval is 10 minutes (600 seconds), it can be converted to cryptographic security as
log2(hashrate*600). As of Nov 2013 the hashrate was 4 petahash/sec, so bitcoin's
hashcash-SHA256^2 proofs-of-work are 62 bits (including +1 for the double hash).
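The arithmetic behind the 62-bit figure can be checked directly; the variable names below are
only illustrative.

```python
import math

hashrate = 4e15          # 4 petahash/sec (Nov 2013 figure from the text)
block_interval = 600     # 10 minutes, in seconds

work_bits = math.log2(hashrate * block_interval)   # about 61.06
security_bits = work_bits + 1                       # +1 for the double hash
```

Rounded to the nearest bit, security_bits gives the 62 bits quoted above.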
Bitcoin also defines a new notion of (relative) difficulty which is the work required so that at
current network hashrate a block is expected to be found every 10 minutes. It is expressed
relative to a minimum work unit of 2^32 iterations (approximately, technically minimum
work is 0xFFFF0000 due to bitcoin implementation level details). Bitcoin difficulty is simple to
approximately convert to log2 cryptographic security: k=log2(difficulty)+32 (or for high
accuracy log2(difficulty*0xFFFF0000)). Difficulty is inversely related to the target:
difficulty = (0xFFFF * 2^208) / target, where 0xFFFF * 2^208 is the target at difficulty 1.
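The two conversions from difficulty to log2 security quoted above can be compared
numerically. This is a small sketch; the function names and the sample difficulty are
arbitrary choices for the illustration.

```python
import math

MIN_WORK = 0xFFFF0000   # minimum work unit quoted in the text (about 2^32)


def bits_approx(difficulty: float) -> float:
    # k = log2(difficulty) + 32
    return math.log2(difficulty) + 32


def bits_precise(difficulty: float) -> float:
    # k = log2(difficulty * 0xFFFF0000)
    return math.log2(difficulty * MIN_WORK)


sample_difficulty = 1e9   # arbitrary value, for illustration only
gap = bits_approx(sample_difficulty) - bits_precise(sample_difficulty)
```

The gap between the two forms is a small constant (well under a thousandth of a bit), which
is why the simpler formula is usually good enough.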