While Hadoop’s job tracker provides detailed information about jobs that have been run in the
cluster, it is not a persistent data store for such information.
Kiji tracks all historical jobs in a
job_history table within every instance.
This information includes an XML dump of the full job configuration, start times,
end times, and all job counters.
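Once the configuration dump has been read out of the table, it can be turned into a plain name/value mapping. The following is a minimal sketch, assuming the dump uses Hadoop's standard configuration layout (a configuration root element containing property elements, each with name and value children). The function name parse_job_configuration and the sample document are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET


def parse_job_configuration(xml_text):
    """Return a {name: value} dict from a Hadoop-style configuration XML string.

    Assumes the <configuration><property><name/><value/></property> layout.
    """
    root = ET.fromstring(xml_text)
    settings = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip entries that have no <name> child
        # A missing or empty <value> is stored as an empty string.
        settings[name] = prop.findtext("value", default="")
    return settings


# Illustrative sample only; this is not output captured from a real job.
SAMPLE_XML = """<configuration>
  <property><name>mapred.job.name</name><value>example-gatherer</value></property>
  <property><name>mapred.reduce.tasks</name><value>4</value></property>
</configuration>"""

if __name__ == "__main__":
    for key, value in sorted(parse_job_configuration(SAMPLE_XML).items()):
        print(key, "=", value)
```

All values come back as strings, so numeric settings such as a reducer count need an explicit conversion before use.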
The job_history table is installed in a particular instance as soon as the first MapReduce job is run in that instance.
You can verify that the table was installed properly using the ls command:
kiji ls kiji://.env/default/job_history
Jobs that extend KijiMapReduceJob automatically record metadata to the job_history table.
If you have a secure Kiji instance, KijiMR should "just work", with one exception: users without WRITE permission on the instance will not have their jobs recorded in the job_history table, and a non-fatal error is reported even if the job ran successfully. For example, users with only READ permission on the instance can run Gatherers, but those jobs will not be recorded. For more information on Kiji security, see the KijiSchema user guide.
If you have GRANT permission on an instance, you can grant WRITE permission on it to a user as follows:
kiji-schema-shell
schema> MODULE security;
schema> GRANT WRITE PRIVILEGES ON INSTANCE 'kiji://myzk:2181/myinstance' TO USER 'ada';
OK.
The JobHistoryKijiTable class is the main class responsible for providing access to the
job_history table. Currently it provides the ability to record and retrieve job metadata. This
is a framework-audience class and is subject to change between minor versions.
Using the API
The JobHistoryKijiTable class surfaces the calls getJobDetails(String jobId) and
getJobScanner() for retrieving the recorded metadata.
The job_history table is a Kiji table under the hood, and can thus be inspected using the
kiji scan and kiji get tools. The EntityId associated with the job_history table is the jobId.
For example, to look at all of the jobIds that have been recorded:
kiji scan kiji://.env/default/job_history/info:jobId
There is also a kiji job-history tool, which displays the job history data in a more human-readable format:
kiji job-history --kiji=kiji://.env/default/
To look up the job data for an individual job with jobId ‘job_20130221123621875_0001’, try:
kiji job-history --kiji=kiji://.env/default --job-id=job_20130221123621875_0001
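When job history lookups are scripted, it can help to assemble the command line in one place. The following is a minimal sketch that wraps the two invocations shown above. The helper names job_history_command and run_job_history are hypothetical, only the --kiji and --job-id flags from the examples above are used, and the wrapper assumes the kiji launcher is on the PATH.

```python
import subprocess


def job_history_command(instance_uri, job_id=None):
    """Build the argument list for the kiji job-history tool.

    Without a job_id the tool is asked for the whole instance's history;
    with one, only that job is looked up.
    """
    command = ["kiji", "job-history", "--kiji=" + instance_uri]
    if job_id is not None:
        command.append("--job-id=" + job_id)
    return command


def run_job_history(instance_uri, job_id=None):
    """Run the tool and return its standard output as text.

    Assumes the kiji launcher is installed and on the PATH; raises
    subprocess.CalledProcessError if the tool exits with a non-zero status.
    """
    result = subprocess.run(
        job_history_command(instance_uri, job_id),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems if an instance URI or job ID ever contains unusual characters.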