The KijiTableReader class provides a get(...) method to read typed data from a Kiji table row. The row is addressed by its EntityId (which can be retrieved from the KijiTable instance using the getEntityId() method). Specify the desired cells from the rows with a KijiDataRequest. See the KijiDataRequest documentation for details.

In general, Kiji and KijiTable instances should be opened only once over the life of an application (EntityIdFactory instances should also be reused). KijiTablePool can be used to maintain a pool of opened KijiTable objects for reuse. To open a KijiTable initially:

// URI of the Kiji instance "kiji_instance_name" in your default HBase instance:
final KijiURI kijiURI =
    KijiURI.newBuilder().withInstanceName("kiji_instance_name").build();
final Kiji kiji = Kiji.Factory.open(kijiURI);
try {
  final KijiTable table = kiji.openTable("table_name");
  try {
    // Use the opened table:
    // …
  } finally {
    // Always close the table you open:
    table.close();
  }
} finally {
  // Always release the Kiji instances you open:
  kiji.release();
}

To read from an existing KijiTable, create a KijiDataRequest specifying the columns of data to return, then query for the desired EntityId using a KijiTableReader. You can obtain a KijiTableReader for a KijiTable with its openTableReader() method.

For example:

final KijiTableReader reader = table.openTableReader();
try {
  // Select which columns you want to read:
  final KijiDataRequest dataRequest = KijiDataRequest.builder()
      .addColumns(ColumnsDef.create().add("some_family", "some_qualifier"))
      .build();
  final EntityId entityId = table.getEntityId("your-row");
  final KijiRowData rowData = reader.get(entityId, dataRequest);
  // Use the row:
  // …
} finally {
  // Always close the reader you open:
  reader.close();
}

The KijiTableReader also implements a bulkGet(...) method for retrieving data for a list of EntityIds. This is more efficient than a series of calls to get(...) because it uses a single RPC instead of one for each get.
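For example, a bulkGet(...) call might look like the following sketch. The row keys "user-0" through "user-2" and the column names are illustrative assumptions; the pattern of collecting EntityIds into a list and issuing one bulkGet(...) is the point:

```java
import java.util.ArrayList;
import java.util.List;

final KijiTableReader reader = table.openTableReader();
try {
  // Select which columns you want to read:
  final KijiDataRequest dataRequest = KijiDataRequest.builder()
      .addColumns(ColumnsDef.create().add("some_family", "some_qualifier"))
      .build();

  // Collect the entity IDs of the rows to fetch
  // ("user-0".."user-2" are hypothetical row keys):
  final List<EntityId> entityIds = new ArrayList<EntityId>();
  for (int i = 0; i < 3; i++) {
    entityIds.add(table.getEntityId("user-" + i));
  }

  // A single RPC retrieves all of the requested rows:
  final List<KijiRowData> rows = reader.bulkGet(entityIds, dataRequest);
  for (KijiRowData row : rows) {
    // Process each row:
    // …
  }
} finally {
  // Always close the reader you open:
  reader.close();
}
```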

Row scanners

If you need to process a range of rows, you may use a KijiRowScanner:

final KijiTableReader reader = table.openTableReader();
try {
  final KijiDataRequest dataRequest = KijiDataRequest.builder()
      .addColumns(ColumnsDef.create().add("family", "qualifier"))
      .build();
  final KijiScannerOptions scanOptions = new KijiScannerOptions()
      .setStartRow(table.getEntityId("the-start-row"))
      .setStopRow(table.getEntityId("the-stop-row"));
  final KijiRowScanner scanner = reader.getScanner(dataRequest, scanOptions);
  try {
    // Scan over the requested row range, in order:
    for (KijiRowData row : scanner) {
      // Process the row:
      // …
    }
  } finally {
    // Always close scanners:
    scanner.close();
  }
} finally {
  // Always close table readers:
  reader.close();
}

Modifying Data

The KijiTableWriter class provides a put(...) method to write or update cells in a Kiji table. The cell is addressed by its entity ID, column family, column qualifier, and timestamp. You can get a KijiTableWriter for a KijiTable using the openTableWriter() method.

final KijiTableWriter writer = table.openTableWriter();
try {
  // Write a string cell named "a_family:some_qualifier" to the row "the-row":
  final long timestamp = System.currentTimeMillis();
  final EntityId eid = table.getEntityId("the-row");
  writer.put(eid, "a_family", "some_qualifier", timestamp, "Some value!");
  writer.flush();
} finally {
  // Always close the writers you open:
  writer.close();
}

Note: the type of the value being written to the cell must match the type of the column declared in the table layout.

Counters

Incrementing a counter value stored in a Kiji cell would normally require a “read-modify-write” transaction using a client-side row lock. Since row locks can cause contention, Kiji exposes a feature of HBase to do this more efficiently by pushing the work to the server side. To increment a counter value in a Kiji cell, the column must be declared with a schema of type “counter”. See Managing Data for details on how to declare a counter in your table layout.

Columns containing counters may be accessed like other columns; counters are exposed as long integers. In particular, a counter value may be read using KijiTableReader.get(...) and written using KijiTableWriter.put(...). In addition, the KijiTableWriter class provides a method to atomically increment counter values.

final KijiTableWriter writer = table.openTableWriter();
try {
  // Incrementing the counter type column "a_family:some_counter_qualifier" by 2:
  final EntityId eid = table.getEntityId("the-row");
  writer.increment(eid, "a_family", "some_counter_qualifier", 2);
  writer.flush();
} finally {
  // Always close the writer you open:
  writer.close();
}
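Reading the current counter value back follows the normal read path. The following sketch assumes the counter cell is retrieved as a Long through KijiRowData's generic getMostRecentValue(...) accessor:

```java
final KijiTableReader reader = table.openTableReader();
try {
  // Request the counter column:
  final KijiDataRequest dataRequest = KijiDataRequest.builder()
      .addColumns(ColumnsDef.create().add("a_family", "some_counter_qualifier"))
      .build();
  final KijiRowData rowData =
      reader.get(table.getEntityId("the-row"), dataRequest);

  // Counters are exposed as long integers:
  final long count =
      rowData.<Long>getMostRecentValue("a_family", "some_counter_qualifier");
  // Use the counter value:
  // …
} finally {
  // Always close the reader you open:
  reader.close();
}
```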

MapReduce


Deprecation Warning

This section refers to classes in the org.kiji.schema.mapreduce package that may be removed in the future. Please see the KijiMR Userguide for information on using MapReduce with Kiji.

The KijiTableInputFormat provides the necessary functionality to read from a Kiji table in a MapReduce job. To configure a job to read from a Kiji table, use KijiTableInputFormat’s static setOptions method. For example:

Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf);

// * Setup jars to ship to the hadoop cluster.
job.setJarByClass(YourClassHere.class);
GenericTableMapReduceUtil.addAllDependencyJars(job);
DistributedCacheJars.addJarsToDistributedCache(job,
    new File(System.getenv("KIJI_HOME"), "lib"));
job.setUserClassesTakesPrecedence(true);
// *

KijiDataRequest request = new KijiDataRequest()
    .addColumn(new KijiDataRequest.Column("your-family", "your-qualifier"));

// Setup the InputFormat.
KijiTableInputFormat.setOptions(job, "your-kiji-instance-name", "the-table-name", request);
job.setInputFormatClass(KijiTableInputFormat.class);

The code between the “// *” markers ships Kiji resources to the DistributedCache, so that all nodes in your Hadoop cluster have access to the Kiji dependencies.

KijiTableInputFormat outputs keys of type EntityId and values of type KijiRowData. This data can be accessed from within a mapper:

@Override
public void map(EntityId entityId, KijiRowData row, Context context) {
  // ...
}
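For instance, a mapper might pull the most recent value of a requested column out of the KijiRowData. This is a sketch: "your-family", "your-qualifier", and the string-typed column are assumptions carried over from the data request above:

```java
@Override
public void map(EntityId entityId, KijiRowData row, Context context)
    throws IOException {
  // Only read the cell if the requested column is present in this row:
  if (row.containsColumn("your-family", "your-qualifier")) {
    // String columns are returned as CharSequence (Avro strings):
    final CharSequence value =
        row.getMostRecentValue("your-family", "your-qualifier");
    // Process the value:
    // …
  }
}
```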

To write to a Kiji table from a MapReduce job, you should use KijiTableWriter as before. You should also set your OutputFormat class to NullOutputFormat, so MapReduce doesn’t expect to create a directory full of text files on your behalf.

To configure a job to write to a Kiji table, refer to the following example:

Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf);

// Setup jars to ship to the hadoop cluster.
job.setJarByClass(YourClassHere.class);
GenericTableMapReduceUtil.addAllDependencyJars(job);
DistributedCacheJars.addJarsToDistributedCache(job,
    new File(System.getenv("KIJI_HOME"), "lib"));
job.setUserClassesTakesPrecedence(true);

// Setup the OutputFormat.
job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(NullWritable.class);
job.setOutputFormatClass(NullOutputFormat.class);

And then, from within a Mapper:

public class MyMapper extends Mapper<LongWritable, Text, NullWritable, KijiOutput> {
  private KijiTableWriter writer;
  private Kiji kiji;
  private KijiTable table;

  @Override
  public void setup(Context context) throws IOException {
    // Open a KijiTable for generating EntityIds.
    kiji = Kiji.open("your-kiji-instance-name");
    table = kiji.openTable("the-table-name");

    // Create a KijiTableWriter that writes to a MapReduce context.
    writer = table.openTableWriter();
  }

  @Override
  public void map(LongWritable key, Text value, Context context) throws IOException {
    // ...

    writer.put(table.getEntityId("your-row"), "your-family", "your-qualifier", value.toString());
  }

  @Override
  public void cleanup(Context context) throws IOException {
    // Close resources in the reverse order they were opened:
    writer.close();
    table.close();
    kiji.close();
  }
}