Windows Azure and Java: Working with Table Storage

Windows Azure Storage provides three distinct storage services: Tables, Blobs, and Queues. Within a single storage account, you may have any combination of these services, up to the 100TB storage account limit (or create multiple storage accounts). In this article, we’ll focus on the Table Storage service.

Table Storage provides structured storage in the form of tables. It can be used to store large amounts of data that need some structure. The main difference between Table Storage and databases like SQL Azure is that Table Storage is a non-relational data store and tables in Table Storage are schema-less. Because Table Storage does not provide any way to represent relationships between data, it is referred to as a NoSQL data store. Data in tables can be accessed quickly using a clustered index, so Table Storage can hold very large data sets while still providing efficient access.

Tables are collections of entities that loosely correspond to rows in SQL Server or SQL Azure. An entity consists of three system properties (partition key, row key, and timestamp) along with a set of custom properties. Including the system properties, an entity can have a total of 255 properties. The combined size of all properties’ data in an entity cannot exceed 1 MB.

Table Storage has a complete REST API for managing tables, entities and queries. To simplify these tasks, there are several language-specific SDKs that wrap these APIs, including the Windows Azure SDK for Java.

Here we will demonstrate the use of the Windows Azure Table Storage service from a Java application, using the Java SDK. Table Storage can be accessed from a Java application running locally or within Windows Azure worker and web role instances. We recently published CloudNinja for Java to GitHub, a reference application illustrating how to build multi-tenant Java-based applications for Windows Azure. CloudNinja for Java uses Windows Azure Table Storage for storing system performance counters.

This blog post demonstrates how you can perform the following tasks:

  • Create and delete a table
  • Insert and delete an entity
  • Retrieve entities from a table

Prerequisites

The prerequisites for using Windows Azure Table Storage service from a Java application are:

  • Java Development Kit (JDK)
  • Windows Azure Libraries for Java
  • Windows Azure SDK

Create a Java Application to Access Table Storage

We add the following import statements to the Java classes that we use to access Table Storage.

// The following imports are required to use the Table Storage APIs
import com.microsoft.windowsazure.services.core.storage.*;
import com.microsoft.windowsazure.services.table.client.*;
import com.microsoft.windowsazure.services.table.client.TableQuery.*;

Retrieving a Storage Account

To retrieve a storage account, initialize an object of the CloudStorageAccount class. The initialized object represents the storage account. We can initialize CloudStorageAccount using a Windows Azure Storage account or a local storage account (Storage Emulator account).

Retrieving a Windows Azure Storage Account

We first need to retrieve the Windows Azure Storage account using the CloudStorageAccount class. The storage account can be retrieved by parsing the connection string using the CloudStorageAccount.parse method. The connection string consists of the default endpoint protocol, storage account name, and storage account key.

The following sample code retrieves the storage account.

public static final String storageConnectionString =
    "DefaultEndpointsProtocol=https;" +
    "AccountName=<storage_account_name>;" +
    "AccountKey=<storage_account_key>";

CloudStorageAccount storageAccount =
    CloudStorageAccount.parse(storageConnectionString);

In this code, the storage account is specified as AccountName and the primary access key of the storage account is specified as AccountKey. The primary access key is listed in the Windows Azure management portal.
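The three-part format of the connection string can be assembled from its pieces rather than typed by hand. The following is a minimal sketch using only the Java standard library; the ConnectionStrings class and its build method are hypothetical helpers for illustration and are not part of the Windows Azure SDK.

```java
// Hypothetical helper (not part of the Windows Azure SDK): assembles a
// connection string in the format described above.
final class ConnectionStrings {

    private ConnectionStrings() {
    }

    // Returns "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...".
    static String build(String accountName, String accountKey) {
        requireNonEmpty(accountName, "accountName");
        requireNonEmpty(accountKey, "accountKey");
        return "DefaultEndpointsProtocol=https;"
                + "AccountName=" + accountName + ";"
                + "AccountKey=" + accountKey;
    }

    // Rejects null or blank values early, before they reach the parser.
    private static void requireNonEmpty(String value, String name) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " must not be empty");
        }
    }
}
```

The resulting string could then be passed to CloudStorageAccount.parse in the same way as the hard-coded constant.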

Working with Storage Account in Local Emulator

Windows Azure SDK provides a Storage Emulator that emulates Windows Azure Storage, and is backed by a local SQL Server instance (SQL Express, by default). While the storage emulator is fine for development, it differs from Windows Azure Storage. Please see this MSDN article for details about specific differences.

The code below retrieves the emulated storage account. Before running the following code, ensure that Storage Emulator is up and running.

CloudStorageAccount storageAccount =
    CloudStorageAccount.getDevelopmentStorageAccount();

While developing an application, the CloudStorageAccount.getDevelopmentStorageAccount method can be used to access the emulated storage account. This is particularly useful if the developer does not have access to a Windows Azure Storage account. However, you should not use this method in code that you deploy to Windows Azure, because the development storage account is not available in Windows Azure.

An alternative approach to accessing the local emulator storage account is to access it just like you would access a real storage account, with a storage account name and key in your configuration file. The emulator account has a special account name and key:

  • Account name: devstoreaccount1
  • Account key: Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==

You can place these in the local configuration file, and place your real credentials in the cloud configuration file, allowing you to easily run code against either account without changing any code.

Development storage account details are documented in this MSDN article.

Performing Operations on Table Storage from a Java Application

To access Table Storage, a table client is required. We use the CloudTableClient class to get references to tables. We create tableClient, an instance of CloudTableClient, from the CloudStorageAccount object. The following sample code creates a table client.

// Create the table client
CloudTableClient tableClient =
    storageAccount.createCloudTableClient();

Once a table client is created, we can perform various operations on Table Storage using the table client.

How to Create a Table

Use the createTableIfNotExists method on CloudTableClient to create a table. This method first checks whether a table with the specified name exists in Windows Azure Storage, and creates the table only if no table with that name exists.

// Create the table if it doesn't exist
String tableName = "Employee";
tableClient.createTableIfNotExists(tableName);

Table Naming Conventions

Table names are alphanumeric and must be unique within a storage account. Full naming guidelines are specified in the MSDN article, “Understanding the Table Service Data Model.”
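A table name can be checked locally before it is sent to the service. The sketch below is illustrative only: the TableNames class is hypothetical, and beyond the alphanumeric rule stated above it assumes the additional rules described in the MSDN article (a name must not begin with a digit and must be 3 to 63 characters long). It does not check uniqueness or any other rule.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of the Windows Azure SDK).
final class TableNames {

    // Assumed rules: a letter first, then 2 to 62 more letters or digits,
    // giving 3 to 63 characters in total.
    private static final Pattern VALID =
            Pattern.compile("^[A-Za-z][A-Za-z0-9]{2,62}$");

    private TableNames() {
    }

    // Returns true if the name satisfies the assumed naming rules.
    static boolean isValid(String name) {
        return name != null && VALID.matcher(name).matches();
    }
}
```

For example, TableNames.isValid("Employee") passes the check, while names such as "1Employee" or "my-table" do not.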

How to Insert an Entity

The entities are mapped to Java objects using a custom class that extends the TableServiceEntity class. To add an entity to a table, first create a class that extends TableServiceEntity and defines the properties of the entity.

The following code creates an entity class for an employee, with the department name as the partition key, the employee ID as the row key, and the first name and last name as additional properties. The partition key and the row key together uniquely identify an entity in the table.

public class EmployeeEntity extends TableServiceEntity {

    // Department name is the partition key; employee ID is the row key
    public EmployeeEntity(String departmentName, String id) {
        this.partitionKey = departmentName;
        this.rowKey = id;
    }

    // No-argument constructor, used when entities are created from query results
    public EmployeeEntity() {
    }

    private String firstName;
    private String lastName;

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }
}

After creating a class for the entity, we insert an entity into the table. We need an object of TableOperation to perform any operation on the table. An operation is defined in this object.

To insert an entity, we first create an instance of the EmployeeEntity class that we defined. Then, we create a TableOperation object that describes the insert by calling the TableOperation.insert method. Finally, we call the execute method on CloudTableClient, passing the table name and the operation object as parameters.

String tableName = "Employee";

// Create a new entity for employee
EmployeeEntity employeeEntity = new EmployeeEntity("HR", "999");
employeeEntity.setFirstName("Steve");
employeeEntity.setLastName("James");

// Create a table operation to insert employee entity into Employee table
TableOperation insertEmployee = TableOperation.insert(employeeEntity);

// Call execute method on table client so as to perform the operation
tableClient.execute(tableName, insertEmployee);

How to Delete an Entity

To delete an entity, first retrieve the entity. This can be done by performing the retrieve table operation. We first create the retrieve table operation by calling the TableOperation.retrieve method with the partition key and the row key of the entity as parameters. We then specify the table name and the retrieve operation object as parameters for the execute method and call the execute method on CloudTableClient.

After retrieving the entity, we create an operation to delete the entity by calling the TableOperation.delete method. The last step is to call execute on CloudTableClient with the delete operation.

String tableName = "Employee";

// Create operation to retrieve an employee with partition key as "HR"
// and row key as "999"
TableOperation findEmployee =
    TableOperation.retrieve("HR", "999", EmployeeEntity.class);

// Call execute method on table client so as to perform the retrieve
// operation for employee with partition key as "HR" and row key as "999"
EmployeeEntity employeeEntity =
    tableClient.execute(tableName, findEmployee).getResultAsType();

// Create operation to delete the entity
TableOperation deleteEmployee = TableOperation.delete(employeeEntity);

// Call execute method on table client so as to perform the delete operation
tableClient.execute(tableName, deleteEmployee);

How to Retrieve a Single Entity

As explained earlier, we first create the retrieve table operation by calling the TableOperation.retrieve method with the partition key and the row key of the entity as parameters. We then call the execute method on CloudTableClient, passing the table name and the retrieve operation object as parameters. The getResultAsType method then casts the result of execute to the type of the variable it is assigned to, which here is EmployeeEntity.

If no entity with the partition key and row key that we specified for TableOperation.retrieve exists in the table, the result returned by getResultAsType is null.

String tableName = "Employee";

// Create operation to retrieve an employee with partition key as "Finance"
// and row key as "111"
TableOperation findEmployee =
    TableOperation.retrieve("Finance", "111", EmployeeEntity.class);

// Call execute method on table client so as to perform the retrieve
// operation for employee with partition key as "Finance" and row key as "111"
EmployeeEntity employeeEntity =
    tableClient.execute(tableName, findEmployee).getResultAsType();

How to Retrieve a Set of Entities

To query a table for entities in a partition, the TableQuery class is used. Filters restrict the query results by specifying conditions, and can be created using the TableQuery.generateFilterCondition method.

We first create filters for the query with the partition key and the row key as the filter conditions. We then combine both filters using the TableQuery.combineFilters method.

To query a table after creating the filters, we call the TableQuery.from method with the table name and EmployeeEntity as parameters. The filter expression is specified by calling the where method on the resulting table query object. The query is finally invoked by calling the execute method on CloudTableClient. The result of the execute method can be iterated over to consume the results.

String tableName = "Employee";

// Create a filter condition where partition key is "Finance"
String partitionFilter = TableQuery.generateFilterCondition(
    TableConstants.PARTITION_KEY,
    QueryComparisons.EQUAL,
    "Finance");

// Create a filter condition where row key is less than "555"
String rowFilter = TableQuery.generateFilterCondition(
    TableConstants.ROW_KEY,
    QueryComparisons.LESS_THAN,
    "555");

// Combine both filters with AND operator
String filter = TableQuery.combineFilters(
    partitionFilter, Operators.AND, rowFilter);

// Create a table query by specifying the table name,
// EmployeeEntity as entity and the filter expression
// being the AND combination of the above filters
TableQuery<EmployeeEntity> query = TableQuery.from(
    tableName, EmployeeEntity.class).where(filter);

// Iterate over the results
for (EmployeeEntity employeeEntity : tableClient.execute(query)) {
    // process each entity
}
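One point worth keeping in mind with the row key filter above: row keys are strings, so "less than 555" is a string comparison rather than a numeric one. The sketch below illustrates the effect with plain Java string comparison, on the assumption that it orders digit-only keys the same way the service does. The RowKeys class and its methods are hypothetical and not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the Windows Azure SDK).
final class RowKeys {

    private RowKeys() {
    }

    // Keeps the row keys that sort before the bound when compared as strings.
    static List<String> lessThan(List<String> rowKeys, String bound) {
        List<String> matches = new ArrayList<>();
        for (String rowKey : rowKeys) {
            if (rowKey.compareTo(bound) < 0) {
                matches.add(rowKey);
            }
        }
        return matches;
    }

    // Zero-pads a numeric ID so that string order matches numeric order.
    static String pad(int id, int width) {
        return String.format("%0" + width + "d", id);
    }
}
```

With this comparison, the row key "1000" sorts before "555" because '1' precedes '5', while "600" does not. If row keys are numeric IDs, padding them to a fixed width (for example, RowKeys.pad(42, 5) yields "00042") is one way to keep range filters intuitive.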

How to Insert Multiple Entities Transactionally

Windows Azure Table Storage has Entity Group Transactions, which are transactions executed within a single partition. When writing multiple entities in this manner, they are all written in one atomic transaction. Note: Entity group transactions only work within the same partition, not across different partitions. Entity group transactions are limited to 100 entities and 4MB total.

To perform entity group operations, TableBatchOperation is used. The following code inserts two entities into a table in a single transaction.

String tableName = "Employee";

// Create a batch operation
TableBatchOperation batchOperation = new TableBatchOperation();

// Create employee entity to insert into table
EmployeeEntity employeeEntity = new EmployeeEntity("HR", "666");
employeeEntity.setFirstName("Betty");
employeeEntity.setLastName("Baines");

// Add the entity to batch operation
batchOperation.insert(employeeEntity);

// Create another employee entity to insert into table
EmployeeEntity employeeEntity1 = new EmployeeEntity("HR", "777");
employeeEntity1.setFirstName("Jordan");
employeeEntity1.setLastName("Lee");

// Add the entity to batch operation
batchOperation.insert(employeeEntity1);

// Call execute method on table client so as to perform the batch operation
tableClient.execute(tableName, batchOperation);

Using the above code, we first create an object of TableBatchOperation and then create employee entities, set their properties, and associate them with the batch operation by calling the TableBatchOperation.insert method. After all this, we call the execute method on CloudTableClient with the table name and the batch operation object as the parameters to perform the batch operation.
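Because an entity group transaction is limited to a single partition and to 100 entities, a larger set of entities has to be divided up before it can be submitted. The following sketch shows one possible way to plan the batches using only the standard library. The BatchPlanner class is hypothetical, each entity is represented simply as a {partitionKey, rowKey} pair, and the 4 MB payload limit is not checked.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of the Windows Azure SDK).
final class BatchPlanner {

    // Entity count limit per entity group transaction, as stated above.
    static final int MAX_BATCH_SIZE = 100;

    private BatchPlanner() {
    }

    // Each element of keys is a {partitionKey, rowKey} pair. The result is a
    // list of batches; every batch holds at most MAX_BATCH_SIZE pairs that
    // all share one partition key.
    static List<List<String[]>> plan(List<String[]> keys) {
        // Group by partition key, keeping first-seen order of partitions
        Map<String, List<String[]>> byPartition = new LinkedHashMap<>();
        for (String[] key : keys) {
            byPartition.computeIfAbsent(key[0], p -> new ArrayList<>()).add(key);
        }

        // Split each partition's group into chunks of at most MAX_BATCH_SIZE
        List<List<String[]>> batches = new ArrayList<>();
        for (List<String[]> group : byPartition.values()) {
            for (int i = 0; i < group.size(); i += MAX_BATCH_SIZE) {
                int end = Math.min(i + MAX_BATCH_SIZE, group.size());
                batches.add(new ArrayList<>(group.subList(i, end)));
            }
        }
        return batches;
    }
}
```

Each planned batch would then correspond to one TableBatchOperation. Note that atomicity applies per batch only: if one batch fails, batches that were already executed are not rolled back.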

How to Update an Entity

To update an entity, retrieve the entity from the table, make changes to the entity, and then save the entity using a replace or merge operation. Replace overwrites the entire entity with the new entity, while merge updates the properties supplied in the new entity and leaves the entity’s other properties unchanged.

The following code performs the merge operation. It first retrieves the entity using the TableOperation.retrieve method. Then, it modifies the last name of the entity and creates a merge operation by calling TableOperation.merge with the modified entity as the parameter of the method. It finally calls the execute method on CloudTableClient object and updates the entity.

String tableName = "Employee";

// Create operation to retrieve an employee with partition key as "Finance"
// and row key as "222"
TableOperation retrieveEntity =
    TableOperation.retrieve("Finance", "222", EmployeeEntity.class);

// Call execute method on table client so as to perform the retrieve
// operation for employee with partition key as "Finance" and row key as "222"
EmployeeEntity employeeEntity =
    tableClient.execute(tableName, retrieveEntity).getResultAsType();

// Update the last name
employeeEntity.setLastName("Rogers");

// Create table operation to merge entity
TableOperation mergeEntity =
    TableOperation.merge(employeeEntity);

// Call execute method on table client so as to perform the merge operation
tableClient.execute(tableName, mergeEntity);
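The difference between replace and merge can be easier to see on a simplified model. The sketch below treats an entity as a map of property names to values; it is a conceptual illustration only, the UpdateSemantics class is hypothetical, and it makes no claim about how the SDK serializes entity properties.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical conceptual model (not part of the Windows Azure SDK).
final class UpdateSemantics {

    private UpdateSemantics() {
    }

    // Replace: the stored properties are discarded in favor of the incoming ones.
    static Map<String, String> replace(Map<String, String> stored,
                                       Map<String, String> incoming) {
        return new HashMap<>(incoming);
    }

    // Merge: incoming properties overwrite or extend the stored ones;
    // stored properties that are not mentioned are kept.
    static Map<String, String> merge(Map<String, String> stored,
                                     Map<String, String> incoming) {
        Map<String, String> result = new HashMap<>(stored);
        result.putAll(incoming);
        return result;
    }
}
```

In this model, if the stored entity has firstName and lastName and the incoming entity carries only lastName, replace leaves an entity with just lastName, whereas merge keeps firstName and updates lastName.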

How to Insert or Merge an Entity

The TableOperation.insertOrMerge method inserts an entity if the entity does not exist. If the entity exists, the modified properties of the entity are merged with the existing properties of the entity.

To demonstrate the insert or merge operation, we create a new employee entity and table operation to insert or merge the created entity by calling TableOperation.insertOrMerge. We then call execute on CloudTableClient with the table name and the insert or merge operation as parameters of the execute method.

String tableName = "Employee";

// Create new employee entity
EmployeeEntity employeeEntity = new EmployeeEntity("Finance", "222");
employeeEntity.setFirstName("Ryan");
employeeEntity.setLastName("Mclaren");

// Create a table operation to insert or merge entity
TableOperation insertOrMergeEntity =
    TableOperation.insertOrMerge(employeeEntity);

// Call execute method on table client so as to perform the insert or merge operation
tableClient.execute(tableName, insertOrMergeEntity);

How to Insert or Replace an Entity

The TableOperation.insertOrReplace method inserts an entity if the entity does not exist. If the entity already exists, the existing entity is replaced with the new entity.

To insert or replace an entity, create a new employee entity and a table operation to insert or replace the entity by calling TableOperation.insertOrReplace. Then, call execute on CloudTableClient with the table name and the insert or replace operation as parameters of the execute method.

String tableName = "Employee";

// Create new employee entity
EmployeeEntity employeeEntity = new EmployeeEntity("HR", "666");
employeeEntity.setFirstName("Shaun");
employeeEntity.setLastName("Adams");

// Create a table operation to insert or replace entity
TableOperation insertOrReplaceEntity =
    TableOperation.insertOrReplace(employeeEntity);

// Call execute method on table client so as to perform the insert or replace operation
tableClient.execute(tableName, insertOrReplaceEntity);

How to Delete a Table

To delete a table, use the deleteTableIfExists method on CloudTableClient. This method deletes the table with the specified name if it exists.

String tableName = "Employee";

// Delete the table if it exists
tableClient.deleteTableIfExists(tableName);

Summary

In this article, we discussed how Windows Azure Table Storage differs from relational database tables. We have also demonstrated various operations that can be performed on Table Storage.

Every entity in a Windows Azure table has a partition key and a row key. The partition key identifies a partition within the table and provides a logical grouping of entities. The row key uniquely identifies an entity within its partition. Together, the partition key and row key uniquely identify an entity within the table. When we query a table without specifying the row key, the storage service scans the entire partition (even if another property is specified in the filter). Likewise, if no partition key is specified, the search is performed across all partitions, which amounts to a full table scan. Both can have an adverse impact on performance. Fortunately, Windows Azure tables are indexed on the partition key and row key, so we can query a table directly by the partition key and row key combination: the service locates the partition first and then the row key within the partition. The values of the partition key and row key should therefore be derived from the data in the table, so that common queries can specify both keys.
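The three query shapes described above can be illustrated with a small in-memory model. This is only a sketch of the idea: the InMemoryTable class is hypothetical, it stores a single lastName property per entity, and it says nothing about how the storage service is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model (not part of the Windows Azure SDK).
final class InMemoryTable {

    // partition key -> (row key -> last name), sorted at both levels
    private final TreeMap<String, TreeMap<String, String>> partitions = new TreeMap<>();

    void put(String partitionKey, String rowKey, String lastName) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>())
                  .put(rowKey, lastName);
    }

    // Point query: both keys are known, so only index lookups are needed.
    String pointQuery(String partitionKey, String rowKey) {
        TreeMap<String, String> rows = partitions.get(partitionKey);
        return rows == null ? null : rows.get(rowKey);
    }

    // Partition scan: the partition is located by key, but every row in it
    // must be examined to filter on a non-key property.
    List<String> partitionScan(String partitionKey, String lastName) {
        List<String> rowKeys = new ArrayList<>();
        TreeMap<String, String> rows = partitions.get(partitionKey);
        if (rows != null) {
            for (Map.Entry<String, String> row : rows.entrySet()) {
                if (row.getValue().equals(lastName)) {
                    rowKeys.add(row.getKey());
                }
            }
        }
        return rowKeys;
    }

    // Table scan: no partition key is given, so every partition is visited.
    // Results are returned as "partitionKey/rowKey".
    List<String> tableScan(String lastName) {
        List<String> ids = new ArrayList<>();
        for (String partitionKey : partitions.keySet()) {
            for (String rowKey : partitionScan(partitionKey, lastName)) {
                ids.add(partitionKey + "/" + rowKey);
            }
        }
        return ids;
    }
}
```

In this model the cost of pointQuery does not depend on how many rows a partition holds, the cost of partitionScan grows with the size of one partition, and the cost of tableScan grows with the size of the whole table.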
