HCP REST Developer Guide v2.1.0
Developer’s Guide
By HCP Engineering
2.6.2 Creating Custom Metadata
4.2.1 Concurrent Access
5 Conclusions
6 Appendix
1 Introduction
1.1 Summary
Hitachi Content Platform (HCP) is a distributed storage system designed to support large, growing repositories of
fixed-content data. HCP stores objects that include both data and metadata. It distributes these objects across the
storage space but still presents them as files in a standard directory structure. HCP provides access to stored objects
through the HTTP protocol, as well as through an integrated search facility.
This document provides Hitachi solution consultants, partners, and integrators with a step-by-step guide to working
with the HCP REST API. It applies to HCP version 3.0 or later.
This document is intended to be a quick start guide and by no means an exhaustive resource on developing
against the HCP API. For comprehensive documentation please refer to the resources that have been included
with your installation of HCP.
The examples in this section are written in the Java programming language and use Apache HTTP Client, an open
source HTTP library.
Enabling Apache HTTP Client's logging is strongly recommended during development so that you can see the
traffic passed to and from HCP. To enable header and context logging, add the following to
your JVM process arguments:
-Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.SimpleLog
-Dorg.apache.commons.logging.simplelog.showdatetime=true
-Dorg.apache.commons.logging.simplelog.log.org.apache.http=DEBUG
-Dorg.apache.commons.logging.simplelog.log.org.apache.http.wire=ERROR
2.1 Creating an Object
Namespaces provide support for a layer of data separation, allowing several applications to safely and securely share
the same tenant. It is important to note that only authenticated users can access an HCP namespace unless
anonymous access is enabled. Also note that these examples use HTTP, which must be enabled on the
namespace.
Let's get started. Our first program uploads content from the file hello.txt to an HCP repository. This file is an ASCII
file with the following text:
Hello world!
We’ll “PUT” our new object in the path “examples/world.txt” in the tenant called “tn01” with a namespace “ns01”.
Because this HCP namespace requires authentication, you will also need to present security credentials with each
HTTP request. Below is the code used in this example:
package com.hds;

import java.io.*;
import java.util.ArrayList;
import java.util.List;

import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpDelete;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.methods.HttpPut;
import org.apache.http.entity.ByteArrayEntity;
import org.apache.http.HttpResponse;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.NameValuePair;

//class name is illustrative; the other sample methods in this guide belong in the same class
public class HcpRestSamples {

    public static void put() throws IOException {
        //specify namespace URL - eg. ns01.tn01.hcp01.hitachi.com/rest/path
        String url = "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt";

        //specify path to file you want to upload(PUT)
        String localFilePath = "/Users/japark/Development/Java/HCPJavaRestSamples/world.txt";

        //create a new HttpClient object and a PUT request object
        HttpClient client = HttpClientBuilder.create().build();
        HttpPut request = new HttpPut(url);

        //add authorization header for user(base64) "exampleuser" with password(md5) "passw0rd"
        request.addHeader("Authorization", "HCP ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb");

        //setup byte array for file to upload(PUT)
        //Utils.fileToByteArray is a helper that is not shown in this listing;
        //java.nio.file.Files.readAllBytes(input.toPath()) is a standard-library equivalent
        File input = new File(localFilePath);
        byte[] fileAsByteArr = Utils.fileToByteArray(input);

        ByteArrayEntity requestEntity = new ByteArrayEntity(fileAsByteArr);

        //set the request to use the byte array
        request.setEntity(requestEntity);
        //execute PUT request
        HttpResponse response = client.execute(request);

        //print response status to console
        System.out.println("Response Code : "
                + response.getStatusLine().getStatusCode() + " "
                + response.getStatusLine().getReasonPhrase());
    }
}
When you run this program, the output from your machine should look something like this:
First, let's direct your attention to the URL used in the program. To perform an operation
against HCP’s REST API you always need to specify a “resource” by supplying a URL. In this example
we are using the “ns01” namespace located within the “tn01” tenant, which is on the HCP system located
at “hcp01.hitachi.com”. In addition, we are using the “REST” interface and addressing the path
“examples/world.txt”. See below for a visual representation of the URL anatomy.
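The parts of the URL described above can also be assembled in code. The following is a minimal sketch that uses only the JDK; the class and method names are hypothetical and are not part of HCP or Apache HTTP Client.

```java
import java.net.URI;

// Hypothetical helper that assembles a namespace URL from its parts:
// <scheme>://<namespace>.<tenant>.<hcp system>/rest/<object path>
final class HcpUrlSketch {

    static URI objectUrl(String scheme, String namespace, String tenant,
                         String hcpSystem, String objectPath) {
        String host = namespace + "." + tenant + "." + hcpSystem;
        // Strip a leading slash so the path is not doubled after /rest/.
        String path = objectPath.startsWith("/") ? objectPath.substring(1) : objectPath;
        return URI.create(scheme + "://" + host + "/rest/" + path);
    }

    public static void main(String[] args) {
        URI url = objectUrl("http", "ns01", "tn01", "hcp01.hitachi.com", "examples/world.txt");
        System.out.println(url);
    }
}
```

Keeping the namespace, tenant, and system name as separate values makes it easier to point the same code at a different namespace or HCP system later.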
Next, let's direct your attention to how we authenticated against HCP. HCP supports a number of different
authentication options such as Active Directory pass-through, SPNEGO, and local authentication. In this example
we authenticated ourselves using a local account with the user “exampleuser” and a password of “passw0rd”. To
pass this information to HCP we generated an “Authorization” header with the value “HCP
ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb”. Note that we have taken the Base64 value of
the username and the md5 hash of the password, and concatenated the two values with a colon character
between them. We have also prepended “HCP” and a space to the generated values.
base64(exampleuser) ZXhhbXBsZXVzZXI=
md5(passw0rd) bed128365216c019988915ed3add75fb
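The header value can be generated rather than hard-coded. The sketch below follows the recipe described above using only JDK classes; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper that builds the value of the Authorization header:
// "HCP " + base64(username) + ":" + md5(password)
final class HcpAuthSketch {

    static String authorizationValue(String username, String password) {
        String user64 = Base64.getEncoder()
                .encodeToString(username.getBytes(StandardCharsets.UTF_8));
        return "HCP " + user64 + ":" + md5Hex(password);
    }

    // Returns the MD5 digest of the text as lowercase hexadecimal.
    static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(authorizationValue("exampleuser", "passw0rd"));
    }
}
```

The result can be passed directly to request.addHeader("Authorization", ...) in the listings in this guide.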
HCP uses the Host header to determine which namespace you are writing to. Apache automatically populates the
Host header, extracting the information from the URL you provide.
WARNING!
If you are not using a library to handle HTTP for you, be sure to transmit
a properly formatted Host header with your request. HCP will reject your
transaction otherwise.
Object data is attached to the body of a PUT request. Apache provides built-in functions to read data from a
byte array for upload.
    //setup byte array for file to upload(PUT)
    File input = new File(localFilePath);
    byte[] fileAsByteArr = Utils.fileToByteArray(input);

    ByteArrayEntity requestEntity = new ByteArrayEntity(fileAsByteArr);

    //set the request to use the byte array
    request.setEntity(requestEntity);
2.1.2 Interpreting Responses
HCP uses standard HTTP/1.1 response codes to communicate transaction status. For security purposes, HCP error
responses do not provide great detail about the underlying cause. A 403 code, for example, could mean bad
credentials, that a namespace or tenant does not exist, or that a user does not have the required permissions.
That being said, common return values for PUT requests include:
Apache maps HTTP response codes to named constants. For instance, the response 201 Created corresponds to
HttpStatus.SC_CREATED
For more recommendations on proper error handling, see Section 4.1: Handling Errors.
HCP returns additional processing details in the header block. Let’s interpret the HTTP response to our program.
The X-HCP-Hash header is a checksum of the data sent to HCP, computed with the hash algorithm specified for the
namespace. The ETag header is an md5 checksum of the data sent to HCP. When a request arrives, the repository
calculates a checksum for the object data received and returns it as part of the HTTP response. Applications
can use the checksum to ensure that data uploaded by the PUT request wasn't corrupted in transit. Sample
code to authenticate content is included in Section 3.3.1: Authenticity and Checksums.
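As a small illustration of that check, the sketch below compares a locally computed digest with the value of an X-HCP-Hash header. It assumes the header value has the form "ALGORITHM HEXDIGEST" (for example "SHA-256 D2A84F4B8B6..."), as in the sample response headers in this guide; the class and method names are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: checks local data against an X-HCP-Hash header value.
final class HcpHashCheckSketch {

    // headerValue is assumed to look like "SHA-256 D2A84F4B...".
    // Returns false if the value is malformed or the algorithm is unknown.
    static boolean matches(String headerValue, byte[] localData) {
        String[] parts = headerValue.trim().split("\\s+", 2);
        if (parts.length != 2) {
            return false;
        }
        try {
            byte[] digest = MessageDigest.getInstance(parts[0]).digest(localData);
            return toHex(digest).equalsIgnoreCase(parts[1]);
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] data = new byte[0];
        // SHA-256 digest of zero bytes of input.
        String header = "SHA-256 E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855";
        System.out.println(matches(header, data));
    }
}
```

In a real program, headerValue would come from the PUT response and localData from the file that was uploaded.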
public static void get() throws IOException {
    //specify namespace URL - eg. ns01.tn01.hcp01.hitachi.com/rest/path
    String url = "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt";

    //create a new HttpClient object and a GET request object
    HttpClient client = HttpClientBuilder.create().build();
    HttpGet request = new HttpGet(url);

    //add authorization header for user(base64) "exampleuser" with password(md5) "passw0rd"
    request.addHeader("Authorization", "HCP ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb");

    //execute the request
    HttpResponse response = client.execute(request);

    //print response status to console
    System.out.println("Response Code : "
            + response.getStatusLine().getStatusCode() + " "
            + response.getStatusLine().getReasonPhrase());

    //get response content; try-with-resources closes the stream when done
    StringBuilder result = new StringBuilder();
    try (BufferedReader rd = new BufferedReader(
            new InputStreamReader(response.getEntity().getContent()))) {
        String line;
        while ((line = rd.readLine()) != null) {
            result.append(line);
        }
    }

    //print response content to console
    System.out.println(result.toString());
}
When you run this program, the HTTP request from your machine should look something like this:
The GET method is used to retrieve existing objects. You can instruct Apache to issue an HTTP GET request by
creating an HttpGet object.
Let's take a look at the URL we used to GET the object data. Notice that the URL is the same as the one we used
to PUT the object into the namespace. A visual representation of the anatomy of a GET request is below.
200 OK Success
X-HCP-Time: 1259584200
Content-Type: text/plain
Content-Length: 12
X-HCP-Type: object
X-HCP-Size: 12
X-HCP-Hash: SHA-256 D2A84F4B8B6...
X-HCP-VersionId: 80205544854849
X-HCP-IngestTime: 1258469614
X-HCP-RetentionClass:
X-HCP-RetentionString: Deletion Allowed
X-HCP-Retention: 0
X-HCP-RetentionHold: false
X-HCP-Shred: false
X-HCP-DPL: 2
X-HCP-Index: false
X-HCP-Custom-Metadata: false
For a list of commonly encountered headers and their function, see Section 6.3.4.1: Authenticated Namespaces. For
a full description of HCP response headers, see Hitachi Content Platform: Using a Namespace.
Object data is attached to the body of a GET response. Apache provides built-in functions that allow you to grab the
data to write to disk. In the following code we write the content to “/tmp/outfile.txt”. Please note that we are using a
BufferedInputStream for faster performance.
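A minimal sketch of that step is shown here. It copies any InputStream to a file through a BufferedInputStream; in the GET program the stream would come from response.getEntity().getContent() and the target would be /tmp/outfile.txt. The class and method names are hypothetical, and the example feeds the method an in-memory stream so that it runs without an HCP system.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper that streams a response body to disk.
final class SaveBodySketch {

    // Copies the stream to the target file and returns the number of bytes written.
    static long saveTo(InputStream body, Path target) {
        long total = 0;
        try (InputStream in = new BufferedInputStream(body);
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for response.getEntity().getContent()
        InputStream body = new ByteArrayInputStream(
                "Hello world!".getBytes(StandardCharsets.US_ASCII));
        Path target = Files.createTempFile("outfile", ".txt");
        System.out.println(saveTo(body, target) + " bytes written to " + target);
    }
}
```

Copying in fixed-size chunks keeps memory use constant regardless of object size, which matters for large objects.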
2.3 Deleting an Object
2.3.1 Using a Namespace
The REST API can be used to delete objects from a repository. The next program deletes the object
examples/world.txt from the namespace.
When you run this program, the HTTP request from your machine should look something like this:
The DELETE method is used to delete objects. You can instruct Apache to issue an HTTP DELETE request
by creating an HttpDelete object.
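For illustration, here is a dependency-free sketch of the same request. Note that it uses the JDK's own java.net.http client (Java 11 or later) rather than Apache's HttpDelete, so that it can be compiled without extra libraries; the class and method names are hypothetical. The example only builds the request, and the commented lines show how it could be sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch: builds an HTTP DELETE request for an HCP object
// using the JDK HTTP client instead of Apache HTTP Client.
final class DeleteRequestSketch {

    static HttpRequest buildDelete(String url, String authorization) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", authorization)
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildDelete(
                "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt",
                "HCP ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb");
        System.out.println(request.method() + " " + request.uri());
        // To send the request against a reachable HCP system:
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.discarding());
    }
}
```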
200 OK Success
Objects under retention or on legal hold cannot be deleted by normal means. See Section 3.4: Retaining Objects for
more details.
2.4.1 Conditionals
HCP allows for conditional operations when making a request. Using them can greatly reduce the number of
operations performed against HCP. For example, a user might want to download a file only if HCP has a different
version of the file than the one stored locally. Making the operation conditional saves the time
required for a response to finish and reduces the number of operations overall.
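The download case described above can be sketched as a conditional GET. The example below relies on standard HTTP semantics: the request carries an If-None-Match header with the ETag of the local copy, and a 304 Not Modified response means the local copy is still current. It uses the JDK's java.net.http client (Java 11 or later) rather than Apache HTTP Client so that it has no dependencies. The class and method names are hypothetical, and the ETag value in main is a placeholder; check the HCP documentation for the conditional headers your release supports.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch of a conditional GET based on the ETag of a local copy.
final class ConditionalGetSketch {

    static HttpRequest buildConditionalGet(String url, String authorization, String localEtag) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", authorization)
                .header("If-None-Match", localEtag)
                .GET()
                .build();
    }

    // 304 Not Modified: the server copy matches the ETag we sent, so no body is returned.
    static boolean localCopyIsCurrent(int statusCode) {
        return statusCode == HttpURLConnection.HTTP_NOT_MODIFIED;
    }

    public static void main(String[] args) {
        HttpRequest request = buildConditionalGet(
                "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt",
                "HCP ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb",
                "\"placeholder-etag-of-local-copy\"");
        System.out.println(request.headers().map());
        System.out.println(localCopyIsCurrent(304));
    }
}
```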
2.4.2 Available Headers
2.4.3 Put operation with condition
Let’s update our program from Section 2.1.1 to only PUT an object if the current version of the object matches the
md5 hash given.
In the above example, HCP evaluates the current version of the file world.txt and compares it with the hash
we provided as a header in line 14. If the current version does not match the hash, you should see the following.
Note that in the above code, we added the “If-None-Match” header as well as the “Expect” header. This tells Apache
to only send the body of the request if a “100-Continue” response is received. Apache will handle the interim steps for
you.
retention    Any valid retention expression as defined in Section 3.4: Retaining Objects    Specify retention settings for an object.
This section teaches you how to manipulate system metadata. To better understand how HCP uses system
metadata to manage the retention of objects, see Section 3.4: Retaining Objects.
    //(the beginning of this listing is not shown)

    //add authorization header for user(base64) "exampleuser" with password(md5) "passw0rd"
    request.addHeader("Authorization", "HCP ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb");

    //setup byte array for file to upload(PUT)
    File input = new File(localFilePath);
    byte[] fileAsByteArr = Utils.fileToByteArray(input);

    ByteArrayEntity requestEntity = new ByteArrayEntity(fileAsByteArr);

    //set the request to use the byte array
    request.setEntity(requestEntity);
    //execute PUT request
    HttpResponse response = client.execute(request);

    //print response status to console
    System.out.println("Response Code : "
            + response.getStatusLine().getStatusCode() + " "
            + response.getStatusLine().getReasonPhrase());
}
Overrides are specified by name as URL query parameters.
Any combination of ampersand-delimited properties can be included in the PUT request. As always, make sure your
data access account has sufficient privileges to create objects and set metadata, especially when enabling privileged
features like indexing and legal holds.
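As a sketch of how such a URL might be assembled, the helper below appends ampersand-delimited properties to an object URL. The retention parameter appears elsewhere in this guide; the shred parameter name used in the example is an assumption made for illustration, so check the HCP documentation for the exact names and values. The class and method names are hypothetical. Values are appended as-is, matching the URLs shown in this guide; percent-encode any value that is not URL-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that appends metadata overrides as URL query parameters.
final class MetadataOverrideUrlSketch {

    static String withOverrides(String objectUrl, Map<String, String> overrides) {
        if (overrides.isEmpty()) {
            return objectUrl;
        }
        // Produces "?name1=value1&name2=value2" in the map's iteration order.
        StringJoiner query = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> entry : overrides.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return objectUrl + query;
    }

    public static void main(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("retention", "N+3y-2M");
        overrides.put("shred", "true"); // parameter name assumed for illustration
        System.out.println(withOverrides(
                "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt", overrides));
    }
}
```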
When you run this program, the HTTP request from your machine should look like this:
Objects are addressed using the standard URL formats for HCP namespaces.
System metadata is included in the body as form-encoded data. Use name-value pairs to format and attach
form-encoded data to an HTTP transaction.
2.5.4 Interpreting Responses
HCP uses standard HTTP/1.1 response codes to communicate transaction status. Common return values for
POST requests include:
400 Bad Request    The request is trying to change the retention setting from a retention class to an
                   explicit setting, such as an absolute Epochal date.
                   The request is trying to set the shred value from true to false.
404 Not Found      HCP could not find the specified object.
public static void addMetaData() throws IOException {
    //specify namespace URL - eg. ns01.tn01.hcp01.hitachi.com/rest/path
    String url = "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt?"
            + "type=custom-metadata&annotation=myannotation";

    //specify path to file you want to upload(PUT)
    String localFilePath =
            "/Users/japark/Development/Java/HCPJavaRestSamples/myannotation.xml";

    //create a new HttpClient object and a PUT request object
    HttpClient client = HttpClientBuilder.create().build();
    HttpPut request = new HttpPut(url);

    //add authorization header for user(base64) "exampleuser" with password(md5) "passw0rd"
    request.addHeader("Authorization", "HCP ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb");

    //setup byte array for file to upload(PUT)
    File input = new File(localFilePath);
    byte[] fileAsByteArr = Utils.fileToByteArray(input);

    ByteArrayEntity requestEntity = new ByteArrayEntity(fileAsByteArr);

    //set the request to use the byte array
    request.setEntity(requestEntity);
    //execute PUT request
    HttpResponse response = client.execute(request);

    //print response status to console
    System.out.println("Response Code : "
            + response.getStatusLine().getStatusCode() + " "
            + response.getStatusLine().getReasonPhrase());
}
The well-formed XML file we are uploading contains the following content.
<?xml version="1.0"?>
<world-params>
    <cruelness unit="skovels">6371</cruelness>
    <favorite-color>
        <blue/>
        <yellow/>
    </favorite-color>
</world-params>
When you run this program, the HTTP request from your machine should look like this:
The PUT method is used to attach custom metadata to an object. Setting the URL parameter type to a value of
custom-metadata identifies the uploaded content as custom metadata. Note that we have also added the annotation
parameter with a value of myannotation. HCP allows up to ten annotations to be attached as custom metadata to each
object stored. If the annotation is well-formed XML it can be up to 1 MB in size. HCP will also accept any other file type,
with a maximum size of 1 GB.
The GET method is used to retrieve custom metadata from an object. Setting the URL parameter type to a value of
custom-metadata indicates that custom metadata should be returned in the response body. Note that we have
also added the annotation parameter with a value of myannotation.
3 Planning Your Application
By now you should have a basic understanding of how to work with a repository using the HCP REST API. In this
section, we examine some best practices to follow when building fault-tolerant, production-ready applications.
HCP periodically updates DNS records to reflect which storage nodes are up, running, and actively accepting traffic.
Here’s an example query for the IP address of the hostname www.hcp01.hitachi.com.
$ dig www.hcp01.hitachi.com
; <<>> DiG 9.6.1-P2 <<>> www.hcp01.hitachi.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20452
;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.hcp01.hitachi.com. IN A
;; ANSWER SECTION:
www.hcp01.hitachi.com. 15 IN A 172.20.2.74
www.hcp01.hitachi.com. 15 IN A 172.20.2.75
www.hcp01.hitachi.com. 15 IN A 172.20.2.76
www.hcp01.hitachi.com. 15 IN A 172.20.2.77
www.hcp01.hitachi.com. 15 IN A 172.20.2.78
www.hcp01.hitachi.com. 15 IN A 172.20.2.79
www.hcp01.hitachi.com. 15 IN A 172.20.2.80
www.hcp01.hitachi.com. 15 IN A 172.20.2.81
As you can see, the HCP system is accessible from eight storage nodes. If a storage node becomes
unavailable, HCP will remove its IP address from the DNS record for www.hcp01.hitachi.com.
HCP uses a load balancing technique called client-side round-robin DNS. Addressing a repository by its qualified
hostname will automatically distribute load amongst available storage nodes. Let’s take a look at an illustration of
the process.
Applications are given a different answer each time they query DNS for the hostname www.hcp01.hitachi.com.
Successive queries resolve to different IP addresses. HCP shuffles the authoritative answer list for each DNS
request it receives. This behavior distributes application traffic across the system, spreading load equally across the
storage nodes.
By contrast, directly addressing the system by IP address will result in your application contacting the same storage
node for every request it issues. This can lead to performance bottlenecks and creates a single point of failure in the
overall solution.
WARNING!
Make sure that both your operating system and runtime environment are
not caching DNS results. Your application will communicate with the
same storage node while its IP address remains in-cache. This may result
in performance bottlenecks.
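On the Java side, the JVM keeps its own cache of successful DNS lookups, controlled by the networkaddress.cache.ttl security property (a value of 0 disables caching). The sketch below sets that property and lists every address a hostname currently resolves to. The class and method names are hypothetical, and operating system level caches are outside its scope.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: disable the JVM's DNS cache and list resolved addresses.
final class DnsLookupSketch {

    // Should be called early, before the first lookup is made.
    static void disableJvmDnsCache() {
        Security.setProperty("networkaddress.cache.ttl", "0");
    }

    // Returns all addresses for the hostname, or an empty list if it cannot be resolved.
    static List<String> resolveAll(String hostname) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(hostname)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Leave the list empty; the caller decides how to react.
        }
        return addresses;
    }

    public static void main(String[] args) {
        disableJvmDnsCache();
        String hostname = args.length > 0 ? args[0] : "www.hcp01.hitachi.com";
        System.out.println(hostname + " -> " + resolveAll(hostname));
    }
}
```

Running the program several times against an HCP hostname is a quick way to confirm that the answers vary between lookups.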
HCP also works with commercial load balancing appliances. Contact a Hitachi Solutions Consultant for more
information on integrating HCP with hardware load balancing equipment.
3.1.2 Fault Tolerance
HCP is a distributed object store capable of reliably operating with multiple component failures. Applications, however,
need to know when a storage node is no longer available. Sending HTTP requests to an unresponsive IP address will
cause application errors, retry delays, and performance outages.
HCP regularly monitors components for failure and periodically updates DNS records to reflect their availability.
Addressing the system by its qualified hostname will automatically re-direct you away from unresponsive storage
nodes. Once a failure is detected, HCP will remove the offending IP address from the DNS answer list and divert
new HTTP transactions away from the misbehaving component.
You remain open to the risk of connecting to unresponsive storage nodes in the time window between a fault’s
occurrence and its detection. Applications should follow best practices detailed in Section 4.1: Handling Errors
to properly handle runtime errors and minimize their impact.
Applications typically read and write data from the primary, and use the secondary for disaster recovery and/or
load balancing purposes.
In the event of a serious service outage at Datacenter A or issues with hcp01, application servers in both datacenters
will need to direct traffic to the failover site.
http://ct.pacs.hcp01.hitachi.com/rest/P293/S321/c7.dcm
http://ct.pacs.hcp02.hitachi.com/rest/P293/S321/c7.dcm
refer to the same object on the primary and backup systems, respectively. Applications can effect a passive
service failover by modifying the hostname in the object URL. Storing the primary and backup system hostnames
in a configuration file and building URLs to access objects at runtime is strongly recommended.
You should carefully consider how your application stores and assembles object URLs. Independently
managing HCP hostnames and object paths can help you build fault-tolerant, replication-friendly applications.
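One way to keep hostnames and object paths separate is sketched below. The system hostnames are held in a list (hard-coded here, but normally read from a configuration file) and combined with the namespace, tenant, and object path at runtime, so the application can try the primary first and the backup second. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds one URL per configured HCP system.
final class FailoverUrlSketch {

    // hcpSystems is ordered by preference: primary first, then backups.
    static List<String> candidateUrls(List<String> hcpSystems, String namespace,
                                      String tenant, String objectPath) {
        List<String> urls = new ArrayList<>();
        for (String system : hcpSystems) {
            urls.add("http://" + namespace + "." + tenant + "." + system
                    + "/rest/" + objectPath);
        }
        return urls;
    }

    public static void main(String[] args) {
        // In a real application these hostnames would come from configuration.
        List<String> systems = List.of("hcp01.hitachi.com", "hcp02.hitachi.com");
        for (String url : candidateUrls(systems, "ct", "pacs", "P293/S321/c7.dcm")) {
            System.out.println(url);
        }
    }
}
```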
As of HCP version 7, active-active replication is supported. This allows for read and write operations to occur against
either one of our HCP systems in our example. In such cases, HCP will hyper replicate metadata from one system to
another and then replicate object data. In this failover situation writes can still occur against the system you’ve failed
over to.
Objects are assigned to regions by their directory path. For instance, the objects
http://ns01.tn01.hcp01.hitachi.com/rest/examples/disco/earth.jpg
http://ns01.tn01.hcp01.hitachi.com/rest/examples/disco/wind.jpg
http://ns01.tn01.hcp01.hitachi.com/rest/examples/disco/fire.jpg
belong to the same region because they share the path /examples/disco. The objects
http://ns01.tn01.hcp01.hitachi.com/rest/examples/disco/earth.jpg
http://ns01.tn01.hcp01.hitachi.com/rest/examples/songs/september.mp3
do not share a directory path, and so may be assigned to different regions.
Creating an object requires updates to its corresponding region database to record system metadata. This
includes information like data placement and retention properties. HCP uses region databases to physically locate
objects amongst the large number of disks, LUNs, and storage nodes under management.
When developing applications with high transaction rates, it is crucial to spread object metadata ownership across all
available regions. Doing so distributes potential metadata queries to all region managers in the system and ensures that a handful
of storage nodes aren’t stuck with the burden of serving layout information for all objects in the repository.
You should carefully consider your application’s object naming scheme and directory layout. Varying directory paths
spreads object management workload evenly across storage nodes. Concentrating objects to a small set of directory
paths can lead to network resource contention at a small set of region managers. Spreading them across a broader
directory structure will balance back-end network traffic, reduce I/O bottlenecks, and improve performance for
metadata-intensive I/O workloads.
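One possible way to vary directory paths is to derive a short prefix from a checksum of the object name. The sketch below is only an illustration of that idea, not a layout recommended by HCP: it uses CRC32 to pick two directory levels of 256 entries each. The class and method names are hypothetical, and any such scheme should be checked against the published directory and object limits for your HCP release.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper that spreads objects over many directories
// by deriving two directory levels from a checksum of the object name.
final class ShardedPathSketch {

    static String shardedPath(String baseDir, String objectName) {
        CRC32 crc = new CRC32();
        crc.update(objectName.getBytes(StandardCharsets.UTF_8));
        long value = crc.getValue();
        // Use the two lowest bytes of the checksum as two-digit hex directory names.
        String prefix = String.format("%02x/%02x", (value >>> 8) & 0xff, value & 0xff);
        return baseDir + "/" + prefix + "/" + objectName;
    }

    public static void main(String[] args) {
        System.out.println(shardedPath("examples", "earth.jpg"));
        System.out.println(shardedPath("examples", "wind.jpg"));
        System.out.println(shardedPath("examples", "fire.jpg"));
    }
}
```

Because the prefix is computed from the name alone, the application can always recompute the full path of an object without storing it separately.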
Like most storage systems, HCP performs optimally within a set of engineered limits. In particular,
applications should be aware of published guidelines concerning
maximum directories per configured system, and
maximum objects per directory
Release notes documenting the latest recommended limits are published with every HCP release. You should
carefully consider your directory layout to ensure these limits are not exceeded during the course of normal operation.
3.3.2 Reliability
HCP provides a few ways to customize data reliability levels within a repository:
RAID – All data stored on HCP is protected by RAID volumes. This is a standard protection
mechanism applied to all data on the repository.
Data Protection Level – HCP can be configured to maintain multiple copies of an object to provide a higher
degree of protection against data loss and corruption. The number of copies is known as the data protection
level, or DPL. As of HCP 7.0 DPL can be configured via service plans, allowing different applications
accessing the same repository to have different levels of data reliability.
Replication – For customers with the most stringent reliability needs, data can be replicated between HCP
systems. Replication is a must for applications with disaster recovery requirements. Replication can be
configured on a tenant-by-tenant basis.
You should carefully consider your data reliability needs, and consult your HCP administrator to help configure
the system to meet them.
Retention settings can be expressed in a number of ways. HCP accepts two formats for specifying an absolute date
until which objects must be retained.
Format Description
1279141446
2010-07-14T21:04:05-0000
will cause the object examples/prunes.jpg to be retained until July 14, 2010 21:04:05 UTC.
As a convenience, retention can also be expressed as an offset from a well-known time. HCP supports a
simple convention that defines relative retention. Definitions begin with a well-known time
Symbol Description
followed by an offset.
Symbol Description
y Years
M Months
w Weeks
d Days
h Hours
m Minutes
s Seconds
Together they define an absolute date for which an object must be retained. Offset symbols can be mixed and
matched to provide an accurate representation of your retention needs. For example,
String url = "http://example-ns01.example-tn01.hcp01.hitachi.com/rest/examples/world.txt"
+ "?retention=N+3y-2M";
will cause the object examples/world.txt to be retained for 2 years and 10 months from the current time. HCP will
perform date-based arithmetic for you (in the order specified) to calculate the absolute timestamp it ultimately assigns.
A detailed summary of the retention format can be found in Section 6.3.5.
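To make the date arithmetic concrete, the sketch below applies an offset string such as +3y-2M to a base time, one term at a time and in the order written, using java.time. It only mirrors the convention described above for illustration; the class and method names are hypothetical, and HCP's own calendar handling may differ in edge cases such as month ends.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: applies retention offsets (y, M, w, d, h, m, s) to a base time.
final class RetentionOffsetSketch {

    private static final Pattern TERM = Pattern.compile("([+-])(\\d+)([yMwdhms])");

    static ZonedDateTime applyOffsets(ZonedDateTime base, String offsets) {
        Matcher matcher = TERM.matcher(offsets);
        ZonedDateTime result = base;
        int end = 0;
        while (matcher.find()) {
            if (matcher.start() != end) {
                throw new IllegalArgumentException("Unexpected text in: " + offsets);
            }
            long amount = Long.parseLong(matcher.group(2));
            if (matcher.group(1).equals("-")) {
                amount = -amount;
            }
            switch (matcher.group(3).charAt(0)) {
                case 'y': result = result.plusYears(amount); break;
                case 'M': result = result.plusMonths(amount); break;
                case 'w': result = result.plusWeeks(amount); break;
                case 'd': result = result.plusDays(amount); break;
                case 'h': result = result.plusHours(amount); break;
                case 'm': result = result.plusMinutes(amount); break;
                case 's': result = result.plusSeconds(amount); break;
                default: throw new IllegalArgumentException("Unknown unit in: " + offsets);
            }
            end = matcher.end();
        }
        if (end != offsets.length()) {
            throw new IllegalArgumentException("Unexpected text in: " + offsets);
        }
        return result;
    }

    public static void main(String[] args) {
        ZonedDateTime base = ZonedDateTime.of(2010, 7, 14, 21, 4, 5, 0, ZoneOffset.UTC);
        ZonedDateTime until = applyOffsets(base, "+3y-2M");
        System.out.println(until + " (epoch seconds: " + until.toEpochSecond() + ")");
    }
}
```

Note that the unit symbols are case-sensitive: M means months and m means minutes.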
Let’s look at an example. The administrator for the namespace ct.pacs.hcp01.hitachi.com repository needs to retain
objects stored in his namespace for a minimum of seven years. He creates a new retention class called
22CCR
to reflect the name of the regulation which drives his data retention needs. The syntax defining the retention
period should look familiar:
A+7y
Data bound to 22CCR will be retained for seven years from time of creation. Applications storing compliance data in
the ct.pacs namespace attach their object to the 22CCR retention class by using the retention URL parameter:
String url = "http://example-ns01.example-tn01.hcp01.hitachi.com/rest/examples/world.txt"
+ "?retention=C+22CCR";
Retention classes – like retention settings – are system metadata. You can use the HCP REST API to manage
system metadata for an object. See Section 2.4:
NOTE
If a retention class is deleted, the retention periods for objects will be set
to “deletion not allowed”. If a new retention class is set with the same
name the objects will inherit the retention settings of the new class.
A namespace in compliance mode behaves like a traditional archive; any attempt to delete an object under retention
is rejected.
A namespace in enterprise mode is more flexible, and allows for audited deletion of objects under retention. Deleting
objects under retention from a namespace in enterprise mode requires Privileged Permissions. Privileged operations
work on an object-by-object basis and require you to specify a reason for each privileged transaction.
Let’s modify the program from Section 2.3 to perform a privileged delete.
You can specify the arguments privileged and reason as URL query parameters, or as form-encoded data.
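As a sketch of the query-parameter form, the helper below appends privileged and reason to an object URL and URL-encodes the free-text reason. The value true for privileged is an assumption made for illustration, as are the class and method names; consult the HCP documentation for the exact values it expects.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds the URL for a privileged delete.
final class PrivilegedDeleteUrlSketch {

    static String privilegedDeleteUrl(String objectUrl, String reason) {
        // The reason is free text, so it is form-encoded (spaces become "+").
        return objectUrl + "?privileged=true&reason="
                + URLEncoder.encode(reason, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(privilegedDeleteUrl(
                "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt",
                "Duplicate record"));
    }
}
```

The resulting URL is then used with an HTTP DELETE request in the same way as in Section 2.3.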
NOTE
Legal holds are useful when objects are needed for legal discovery. See Hitachi Content Platform: Using a
Namespace for more information on placing objects on legal hold.
WARNING!
Legal holds do not affect custom metadata. Objects under legal hold
can have their custom metadata modified without restriction.
3.5.1 HTTPS
Customers with stringent internal data security requirements should strongly consider using HTTPS as a transport
protocol, especially when accessing objects in an HCP namespace. HCP namespaces require you to transmit static
security tokens as part of the HTTP header block. Attackers with internal access to your network infrastructure could
passively monitor network traffic to extract account credentials from unsecured REST transactions. Using HTTPS to
encrypt traffic makes it much more difficult for attackers to gain access to account passwords and impersonate valid
users. HTTPS imposes a modest impact on performance.
extraction through the physical theft of one or more HCP components. Data-at-rest encryption imposes an impact on
overall HCP performance.
4 Advanced Topics
4.1 Handling Errors
4.1.1 Retrying Transactions
You should retry an HTTP transaction if you are
In the first two scenarios the most likely cause is the unanticipated failure of the storage node you are connected
to. HCP monitors the system at periodic intervals and diverts traffic from misbehaving components. You should
wait several seconds before retrying your transaction to provide HCP sufficient time to detect and correct any
issues it encounters.
1 15 secs
2 30 secs
3 60 secs
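The retry schedule can be sketched as a small helper. It assumes the table above lists the wait before the first, second, and third retry (15, 30, and 60 seconds); the class and method names are hypothetical. The waiting function is passed in so that the example can run without actually sleeping.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

// Hypothetical sketch: retries a transaction with waits of 15, 30 and 60 seconds.
final class RetrySketch {

    // Wait before retry 1, 2 and 3, in seconds (assumed reading of the table).
    static final long[] WAIT_SECONDS = {15, 30, 60};

    // Runs the transaction, retrying after each failure until the waits are used up.
    static <T> T callWithRetries(Supplier<T> transaction, LongConsumer waitSeconds) {
        RuntimeException lastFailure = null;
        for (int attempt = 0; attempt <= WAIT_SECONDS.length; attempt++) {
            try {
                return transaction.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt < WAIT_SECONDS.length) {
                    waitSeconds.accept(WAIT_SECONDS[attempt]);
                }
            }
        }
        throw lastFailure;
    }

    // A real waiting function; pass RetrySketch::sleepSeconds in production code.
    static void sleepSeconds(long seconds) {
        try {
            Thread.sleep(seconds * 1000L);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated transaction that fails twice before succeeding.
        String result = callWithRetries(() -> {
            if (calls[0]++ < 2) {
                throw new IllegalStateException("storage node unavailable");
            }
            return "success";
        }, seconds -> System.out.println("would wait " + seconds + " s"));
        System.out.println(result);
    }
}
```

Only wrap transactions that are safe to repeat, and only retry in the failure scenarios this section describes.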
deletable, and
overwritable
As recommended in Section 4.1.1: Retrying Transactions, you should retry failed PUT requests in the event
the transaction fails before a valid response code is issued.
NOTE
Increasing the thread count and effective parallelism of your application will intensify load on HCP. Be sure to follow the
guidelines found in Section 3.1.1: Load Balancing to properly distribute traffic across storage nodes in the system.
Threads working in parallel should be directed against different directories, if possible. Section 3.2.1:
Region Management illustrates the impact of high traffic volume when leveraged against a single directory.
Chunked transfers impose a modest performance penalty relative to normal PUT requests. You should
carefully consider your application architecture before using this feature.
public static void get() throws IOException {
    //specify namespace URL - eg. namespace.tenant.HCP.DOMAIN.com/rest/path
    String url = "http://example-namespace.example-tenant.cluster59h-vm3.lab.archivas.com/rest/examples/world.txt";

    //create a new HttpClient object and a GET request object
    HttpClient client = HttpClientBuilder.create().build();
    HttpGet request = new HttpGet(url);

    //add authorization header for user(base64) "exampleuser" with password(md5) "passw0rd"
    request.addHeader("Authorization", "HCP ZXhhbXBsZXVzZXI=:bed128365216c019988915ed3add75fb");
    //request only the first five bytes (offsets 0 through 4) of the object
    request.addHeader("Range", "bytes=0-4");

    //execute the request
    HttpResponse response = client.execute(request);

    //print response status to console
    System.out.println("Response Code : "
            + response.getStatusLine().getStatusCode() + " "
            + response.getStatusLine().getReasonPhrase());

    //get response content; try-with-resources closes the stream when done
    StringBuilder result = new StringBuilder();
    try (BufferedReader rd = new BufferedReader(
            new InputStreamReader(response.getEntity().getContent()))) {
        String line;
        while ((line = rd.readLine()) != null) {
            result.append(line);
        }
    }

    //print response content to console
    System.out.println(result.toString());
}
NOTE
4.4 Versioning
By default, a new version of an object inherits metadata values from its immediate predecessor. You can override these
values by manually specifying different metadata values when you create the new version. See Section 2.4:
Versions are created using PUT requests. Section 2.1: Creating an Object provides a step-by-step tutorial on
creating new objects.
X-HCP-VersionId: 79885459513089
X-HCP-VersionId is used to explicitly retrieve historic versions of objects. Let’s modify the program listing in Section 2.2.1
to retrieve a version of the object examples/world.txt with version identifier 79885459513089.
The GET method is used to retrieve historical versions of objects. The ID of the version to retrieve is specified in-
line as a URL parameter. The code required to download a specific version of an object otherwise remains the
same as the code required to download the current version. As always, object data is returned in the response body
on success.
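As a rough illustration of the paragraph above, the sketch below builds (but does not send) a GET request for one historic version. It uses the JDK's built-in java.net.http client rather than Apache HTTP Client purely to stay dependency-free. The class name VersionRequests is hypothetical, and the query parameter name version is an assumption based on the version-list request described in this section.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds a GET request for one historic version. */
final class VersionRequests {
    static HttpRequest getVersion(String objectUrl, long versionId, String authToken) {
        // Assumption: the version ID is passed inline as the "version" query parameter.
        URI uri = URI.create(objectUrl + "?version=" + versionId);
        return HttpRequest.newBuilder(uri)
                // Same "HCP <token>" Authorization header used by the earlier GET listing.
                .header("Authorization", "HCP " + authToken)
                .GET()
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); apart from the query parameter, nothing differs from downloading the current version.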
NOTE
You cannot delete individual historic versions of an object. You can only
purge them. When an object is purged from a repository, all its historic
versions are deleted.
Purging an object is similar to deleting one. Let’s modify the program in Section 2.3 to purge the
object examples/world.txt from the HCP namespace Ns01.
String url = "http://ns01.tn01.hcp01.hitachi.com/rest/examples/world.txt?"
+ "purge=true";
The DELETE method is used to purge an object. The argument purge explicitly asks for the object to be purged from
the repository. It is specified in-line as a URL parameter.
If an object is on legal hold or under retention, a privileged purge is required to purge the object from the repository.
Privileged purges are described in greater depth in Section 3.4: Retaining Objects.
The GET method is used to download version lists. Setting the URL query parameter version to a value of list
requests a version list to be returned in the response body.
An XML document listing all versions of the object is returned in lieu of object data:
customMetadata="false"
state="deleted"
version="80232488876481"
/>
</versions>
In this example we see that the object examples/world.txt has two versions: the current version and a historic one.
Metadata for each is returned as XML attributes. The version ID field can be used to retrieve a particular version of
the object. See Section 4.4.2: Retrieving Versions for more details on retrieving historic versions of an object.
For a summary of the XML format for version lists, see Section 6.3.6.1: Object versions. For a complete description of
the version list format, see Hitachi Content Platform: Using a Namespace.
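Once a version list has been downloaded, the version IDs can be pulled out with the JDK's DOM parser. The following is a minimal sketch, assuming the response body is already available as a string and that each version appears as an entry element with state and version attributes, as in the excerpt above. VersionListParser is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that extracts version IDs from a version list document. */
final class VersionListParser {
    /** Returns the version attribute of every entry whose state equals the given state. */
    static List<String> versionIds(String xml, String state) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList entries = doc.getElementsByTagName("entry");
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            if (state.equals(entry.getAttribute("state"))) {
                ids.add(entry.getAttribute("version"));
            }
        }
        return ids;
    }
}
```

Each returned ID can be used to request that particular version of the object.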
When you run this program, the HTTP request from your machine should look something like this:
The HEAD method is used to retrieve information about an object. You can instruct Apache to issue HTTP HEAD
requests by creating an HttpHead object.
HEAD requests are used to determine whether an object exists in a namespace and, if it does, to return its system metadata.
HTTP/1.1 200 OK
X-HCP-Time: 1259584200
Content-Type: text/plain
Content-Length: 12
X-HCP-Type: object
X-HCP-Size: 12
X-HCP-Hash: SHA-256 D2A84F4B8B6...
X-HCP-VersionId: 80205544854849
X-HCP-IngestTime: 1258469614
X-HCP-RetentionClass:
X-HCP-RetentionString: Deletion Allowed
X-HCP-Retention: 0
X-HCP-RetentionHold: false
X-HCP-Shred: false
X-HCP-DPL: 2
X-HCP-Index: false
X-HCP-Custom-Metadata: false
For a list of commonly encountered headers and their function, see Section 6.3.3: Authenticated Namespaces. For a
full description of HCP response headers, see Hitachi Content Platform: Using a Namespace.
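The response headers shown above can be turned into typed values with a few lines of code. This sketch assumes the headers have already been collected into a map keyed by header name. ObjectMetadata is a hypothetical class, and reading X-HCP-IngestTime as seconds since 1 January 1970 is an assumption based on the ingestTime description in the appendix.

```java
import java.time.Instant;
import java.util.Map;

/** Hypothetical value class built from the headers of a HEAD response. */
final class ObjectMetadata {
    final long size;
    final Instant ingestTime;
    final boolean onHold;
    final String versionId;

    private ObjectMetadata(long size, Instant ingestTime, boolean onHold, String versionId) {
        this.size = size;
        this.ingestTime = ingestTime;
        this.onHold = onHold;
        this.versionId = versionId;
    }

    /** Converts selected X-HCP-* headers into typed fields. */
    static ObjectMetadata fromHeaders(Map<String, String> headers) {
        return new ObjectMetadata(
                Long.parseLong(headers.get("X-HCP-Size")),
                // Assumption: the ingest time is expressed in seconds since the epoch.
                Instant.ofEpochSecond(Long.parseLong(headers.get("X-HCP-IngestTime"))),
                Boolean.parseBoolean(headers.get("X-HCP-RetentionHold")),
                headers.get("X-HCP-VersionId"));
    }
}
```

A missing header would cause an exception here; real code should check for absent values before parsing.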
Apache does not print HTTP response headers by default. You can enable the output by adding the flags
described in Section 2.
When you run this program, the HTTP request from your machine should look something like this:
The GET method is used to retrieve directory listings. Setting the URL parameter type to the value directory will
cause HCP to return an XML document listing the contents of a directory.
Directory listings do not include deleted directories or objects by default. If the namespace supports versioning, or
did in the past, you can request that previously deleted items be included in the result list.
String url = "http://example-ns01.example-tn01.hcp01.hitachi.com/rest/examples/?"
+ "type=directory&deleted=true";
parentDir="/rest"
utf8ParentDir="/rest"
dirDeleted="false"
showDeleted="true">
<entry urlName="world.txt"
utf8Name="world.txt"
type="object"
size="12"
hashScheme="SHA-256"
hash="D2A84F4B8B6..."
retention="0"
retentionString="Deletion Allowed"
retentionClass=""
ingestTime="1258392981"
ingestTimeString="11/16/2009 12:36PM"
hold="false"
shred="false"
dpl="2"
index="true"
customMetadata="false"
version="80238375537921"
state="created"
/>
<entry urlName="obsolete"
utf8Name="obsolete"
type="directory"
state="deleted"
/>
</directory>
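A listing like the one above can be queried with the JDK's XPath support. The sketch below is only an illustration: it returns the names of entries marked as deleted, assuming the document root is a directory element and that each entry carries urlName and state attributes as shown. DirectoryListing is a hypothetical name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper that inspects a directory listing document. */
final class DirectoryListing {
    /** Returns the urlName of every entry whose state attribute is "deleted". */
    static List<String> deletedNames(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Select the urlName attribute nodes of all deleted entries.
        NodeList nodes = (NodeList) xpath.evaluate(
                "/directory/entry[@state='deleted']/@urlName",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeValue());
        }
        return names;
    }
}
```

Deleted entries only appear in the listing when the request asks for them with deleted=true.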
The GET method is used to access namespace information. The path in the URL specifies the type of
information being requested.
Path Summary
NOTE
Namespace information is returned as an XML document in the response body. Let’s modify the program in Section
2.2.1 to download statistics for the HCP namespace ns01.
HCP should return an XML document that looks like this:
For a summary of the XML formats used by HCP to return namespace information, see Section 6.3.5: XML
Documents. For a full accounting of the XML formats, see Hitachi Content Platform: Using a Namespace.
5 Conclusions
By now you should know enough about HCP and the REST API to begin planning and developing useful applications.
HCP provides a strong foundation to develop highly reliable and secure applications suitable for use in the most
demanding environments.
6 Appendix
6.1 Code Availability
The code used in these examples is publicly available on GitHub for review and use. It can be found at:
https://github.com/jhp612/HCPJavaRestSamples
Please note that the GitHub code has been optimized for practical use, whereas the code in this document has been
optimized for easy reference.
You can find more information about Apache HTTP Client, including user manuals, API references, and sample code
at http://hc.apache.org/httpclient-3.x/
HEAD    Checks existence of:  objects, versions, directories, custom metadata
        Retrieves:            object metadata, version metadata
PUT     Creates:              objects, versions, empty directories, custom metadata
DELETE  Removes:              objects, versions, empty directories
6.3.2.1 Cookie
The Cookie header is used to transmit authentication tokens to HCP for validation. Valid tokens allow access to HCP
namespaces. Authentication tokens are defined as the base64 encoding of a valid account name concatenated with
an MD5 hash of its password, using a colon as a delimiter.
See Section 2.1.1 for a tutorial on using the Cookie header with the Apache HTTP library.
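The token format described above can be sketched with JDK classes only. AuthToken is a hypothetical name, and two details are assumptions rather than documented behaviour: the account name and password are encoded as UTF-8, and the MD5 hash is written as lowercase hexadecimal, matching the appearance of the token in the GET listing earlier in this guide.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical helper that builds an authentication token. */
final class AuthToken {
    /** Returns base64(username) + ":" + md5(password) as lowercase hex. */
    static String create(String username, String password) {
        try {
            String user = Base64.getEncoder()
                    .encodeToString(username.getBytes(StandardCharsets.UTF_8));
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            // Render the 16-byte digest as 32 hex characters, keeping leading zeros.
            String hash = String.format("%032x", new BigInteger(1, digest));
            return user + ":" + hash;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

For the account name exampleuser, the first half of the token is ZXhhbXBsZXVzZXI=, which matches the value used in the GET listing.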
6.3.2.2 Range
The Range header is used for partial content transfers. User-defined byte ranges and offsets can be passed as
parameters in GET requests to vary the amount of data returned in the response.
Range: bytes=0-2351
For a full description of range options, see Hitachi Content Platform: Using a Namespace.
6.3.2.3 Host
The Host header is used to identify an HCP system, its namespaces, and tenants. Namespaces are addressed by
hostname, arranged hierarchically as follows:
Host: namespace.tenant.repository.domain.suffix
Host is used in conjunction with the Cookie header to authorize access to HCP namespaces. See Section 2.1.1 for a step-
by-step tutorial on using the Host header.
6.3.2.4 Transfer-Encoding
Transfer-Encoding is used for chunked content transfers.
Transfer-Encoding: chunked
Chunked transfers allow a client to upload data to the repository without initially knowing the total content length.
For more information on using chunked transfers, see Section 4.3.1: Chunked Transfers.
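To make the idea concrete, the sketch below builds a PUT request whose body comes from a stream of unknown length. It uses the JDK's built-in java.net.http client, not Apache HTTP Client, to stay dependency-free. ChunkedPut is a hypothetical name, and the behaviour relied on is that of the JDK client, which sends a body of unknown length with Transfer-Encoding: chunked over HTTP/1.1.

```java
import java.io.InputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.util.function.Supplier;

/** Hypothetical helper that builds a PUT request with a streamed body. */
final class ChunkedPut {
    static HttpRequest build(String objectUrl, String authToken, Supplier<InputStream> data) {
        return HttpRequest.newBuilder(URI.create(objectUrl))
                // Chunked transfer encoding is an HTTP/1.1 feature.
                .version(HttpClient.Version.HTTP_1_1)
                .header("Authorization", "HCP " + authToken)
                // A stream-backed publisher reports an unknown content length (-1),
                // so no Content-Length header can be sent up front.
                .PUT(HttpRequest.BodyPublishers.ofInputStream(data))
                .build();
    }
}
```

When the total size is known in advance, a fixed-length publisher such as BodyPublishers.ofFile is the simpler choice and avoids the chunking overhead.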
401  Unauthorized                     ALL                           Insufficient privileges
404  Not Found                        DELETE, GET, HEAD, POST, PUT  Could not find object, version, or directory
                                                                    Requested version is the current version of deleted object
414  Request URI Too Large            ALL                           URL following /rest prefix longer than 4,095 bytes
416  Requested Range Not Satisfied    GET                           Start position greater than size of requested data
                                                                    Byte range is size zero
500  Internal Server Error            ALL                           Internal error. Please contact namespace administrator
X-HCP-DPL Data protection level of the object
X-HCP-Type Object entity type. This is always object for objects and
versions of objects.
privileged delete. This setting cannot be changed.
([RAN])?([+-]\d+y)?([+-]\d+M)?([+-]\d+w)?([+-]\d+d)?([+-]\d+h)?([+-]\d+m)?([+-]\d+s)?
Symbol Description
N Current time
y Years
M Months
w Weeks
d Days
h Hours
m Minutes
s Seconds
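The offset format can be checked and evaluated on the client side before it is sent. The sketch below is a rough illustration, not HCP's own evaluation logic. RetentionOffset is a hypothetical name, the caller supplies the base time instead of the code resolving the leading R, A, or N symbol, the lowercase m unit is treated as minutes, offsets are applied left to right, and an empty expression is rejected even though the pattern itself would match it.

```java
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that validates and applies a retention offset expression. */
final class RetentionOffset {
    private static final Pattern FORMAT = Pattern.compile(
            "([RAN])?([+-]\\d+y)?([+-]\\d+M)?([+-]\\d+w)?([+-]\\d+d)?"
            + "([+-]\\d+h)?([+-]\\d+m)?([+-]\\d+s)?");

    // Units for capture groups 2 through 8, in the order they appear in the pattern.
    private static final ChronoUnit[] UNITS = {
        ChronoUnit.YEARS, ChronoUnit.MONTHS, ChronoUnit.WEEKS, ChronoUnit.DAYS,
        ChronoUnit.HOURS, ChronoUnit.MINUTES, ChronoUnit.SECONDS};

    static boolean isValid(String expr) {
        return !expr.isEmpty() && FORMAT.matcher(expr).matches();
    }

    /** Adds each signed offset in expr to base and returns the resulting time. */
    static ZonedDateTime apply(ZonedDateTime base, String expr) {
        Matcher m = FORMAT.matcher(expr);
        if (expr.isEmpty() || !m.matches()) {
            throw new IllegalArgumentException("Invalid retention offset: " + expr);
        }
        ZonedDateTime result = base;
        for (int i = 0; i < UNITS.length; i++) {
            String part = m.group(i + 2); // e.g. "+1y" or "-2d", or null if absent
            if (part != null) {
                long amount = Long.parseLong(part.substring(0, part.length() - 1));
                result = result.plus(amount, UNITS[i]);
            }
        }
        return result;
    }
}
```

Because the pattern fixes the order of the units, an expression such as N+1d+1y is rejected while N+1y+1d is accepted.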
6.3.5.3 Classes
You can bind an object to a retention class by using the syntax
C+class-name
For instance, the syntax for binding an object to the class jamba would be
C+jamba
<versions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="/static/xsd/ns-versions.xsd"
path="object-path"
utf8Path="object-path"
parentDir="parent-directory-path"
utf8ParentDir="parent-directory-path"
deleted="true|false"
showDeleted="true|false">
<entry urlName="object-name"
utf8Name="object-name"
type="object"
size="size-in-bytes"
hashScheme="hash-algorithm"
hash="hash-value"
retention="retention-seconds-after-1/1/1970"
retentionString="retention-datetime-value"
retentionClass="retention-class-name"
ingestTime="ingested-seconds-after-1/1/1970"
ingestTimeString="ingested-datetime"
hold="true|false"
shred="true|false"
dpl="dpl"
index="true|false"
customMetadata="true|false"
state="created|deleted"
version="version-id"
/>
...
</versions>
value="retention-value"
autoDelete="true|false">
<description><![CDATA[
class-description
]]></description>
</retentionClass>
...
</retentionClasses>