Developer Tools and API


Want to download BHL data as MODS, RIS, BibTeX, or .txt? See our Data Exports.

This section describes BHL's strategy to achieve //**Goal 2 (Tools and Services): Develop services and tools which facilitate discovery and improve research efficiency of BHL content.**//

BHL builds APIs to allow individual users and data providers to remix and reuse BHL content. The following APIs are currently available. To suggest an API or enhancement, please use [|contact us].

toc

=Data Licensing=
include component="page" wikiName="biodivlib" page="Data Licensing" editable="1" wrap="1"

=APIs=
The BHL Application Programming Interface (API) is a set of REST-like web services that can be invoked via HTTP queries (GET/POST requests) or SOAP. Responses are returned in one of three formats: JSON, XML, or XML wrapped in a SOAP envelope.

 * **Version 2**:
> The documentation for the latest version of the API can be found at []. The first version of the API was limited to data related to scientific names found in the BHL collection; version 2 adds access to title, author, volume, and page information. Please note that users are required to obtain an API Key from [] in order to use version 2 of the API. This is the preferred version of the API.
 * **Version 1 (formerly the BHL Name Services)**:
> Updated documentation for the first version of the API can be found at []. //This version of the API is provided solely to maintain backwards compatibility//.

 * **Citation Linking:**
> BHL provides access to its content via an OpenURL Resolver, as documented and described here:
> []
> BHL's OpenURL Resolver is a popular tool used by biodiversity databases for linking into citations and exact pages of scanned materials.
>
> Data providers can also include links to literature using our stable URLs for scanned pages. The URL is displayed below "Link to this page". For example, to cite the original description of //Zea mays//:
> Citation: Carl Linnaeus' //Species Plantarum//. 2 : 971. 1753.
> []

Also, check these **API wrappers** provided by colleagues from the open source community:
> A Ruby wrapper of the BHL API version 2.5.x functionality, available as a gem, contributed by Matt Yoder et al.:
>> @https://github.com/SpeciesFileGroup/rubyBHL
> An R interface to the BHL API contributed by Scott Chamberlain and Karthik Ram through the rOpenSci project (http://ropensci.org/):
>> @https://github.com/ropensci/rbhl

=Data Exports=
BHL provides custom data exports, as well as exports that integrate with reference management applications such as EndNote, Zotero, or Mendeley. Please review our Data Exports page for more information.

=Scientific Names=
The Biodiversity Heritage Library uses taxonomic intelligence tools, including Global Names Recognition and Discovery (GNRD) developed by Global Names Architecture, to locate, verify, and record scientific names located within the text of each digitized page.

Note: The text used for this identification is uncorrected OCR, so the results may not include every name expected or visible on the page.

This names-based index is a valuable tool for organismal research, and is easily incorporated into external web sites through two different methods of access.

=Bibliography by URL=
To link to a list of all pages containing a given scientific name, use the following URL:
http://www.biodiversitylibrary.org/name///Scientific_name//
where //Scientific_name// is any uninomial, binomial, or trinomial. Replace spaces with the underscore character ( _ ). Examples:
 * [] (Orchid family)
 * [] (Great white shark)
 * [] (Great Cormorant)
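
The URL pattern above can be sketched in Python as follows. This is a minimal illustration, not an official BHL client: the function name `name_bibliography_url` is hypothetical, and the scientific name in the usage line is chosen only as an example.

```python
NAME_BASE_URL = "http://www.biodiversitylibrary.org/name/"


def name_bibliography_url(scientific_name: str) -> str:
    """Build the name bibliography URL for a uninomial, binomial, or trinomial.

    Hypothetical helper: it only applies the rule stated above
    (spaces are replaced with underscores).
    """
    # split() with no arguments also trims the ends and collapses repeated spaces
    parts = scientific_name.split()
    if not parts:
        raise ValueError("scientific_name must not be empty")
    return NAME_BASE_URL + "_".join(parts)


print(name_bibliography_url("Carcharodon carcharias"))
```

The helper does not check that the name is taxonomically valid; it only formats whatever string it is given.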

=Image URLs=
To retrieve a static JPG image of any page in BHL, use the following: http://www.biodiversitylibrary.org/pagethumb/pageid,width,height. For example, []. If width and height values are not specified, the image returned defaults to a width of 200 and a height of 300. Note that because of how the Internet Archive's image servers fulfill requests, the image returned will be close to the specified dimensions, but not necessarily exact.

To retrieve a full-size, full-resolution JPG image of any BHL page, use the following: http://www.biodiversitylibrary.org/pageimage/pageid. For example, [].
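
As a rough sketch, the two image URL patterns can be assembled like this in Python. The function names and the page ID used in the usage lines are hypothetical. Requiring width and height to be given together is an assumption of this sketch; the text above only describes the cases where both or neither are supplied.

```python
from typing import Optional

BHL_BASE_URL = "http://www.biodiversitylibrary.org"


def page_thumb_url(page_id: int,
                   width: Optional[int] = None,
                   height: Optional[int] = None) -> str:
    """Build a pagethumb URL; omit width and height to get the default size."""
    if width is None and height is None:
        return f"{BHL_BASE_URL}/pagethumb/{page_id}"
    # Assumption: a lone width or lone height is rejected, since that case is undocumented here
    if width is None or height is None:
        raise ValueError("give both width and height, or neither")
    return f"{BHL_BASE_URL}/pagethumb/{page_id},{width},{height}"


def page_image_url(page_id: int) -> str:
    """Build the URL of the full-size, full-resolution JPG for a page."""
    return f"{BHL_BASE_URL}/pageimage/{page_id}"


# 1234567 is a made-up page ID used only for illustration
print(page_thumb_url(1234567, 100, 150))
print(page_image_url(1234567))
```

Because the image servers return sizes that are only close to the request, callers should read the actual dimensions from the downloaded image instead of assuming the requested ones.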

=Stable URLs=
BHL produces stable URLs for our content and will ensure viability of these URLs. Please [|read the following blog post] for an explanation of how BHL redirects certain IDs when a book has been taken offline. Stable URLs are available for the following areas of content, with examples:
 * Subject: Insects []
 * Author: Darwin, Charles, (1809 - 1882) []
 * Title: //The Journal of the Linnean Society// []
 * Item/Book: //The Journal of the Linnean Society//, v. 8 1865 []
 * Page/Article: Bentham, G. (1865). On the Genera //Sweetia//, Sprengel, and //Glycine//, Linn., simultaneously published under the name of //Leptolobium//. //The Journal of the Linnean Society, 8//: 259-267. []

=OAI-PMH=
Metadata about the books and journals in the BHL collection is published via OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting). OAI-PMH is a protocol used for publishing and harvesting metadata descriptions of records in an archive. More information about the protocol can be found at @http://www.openarchives.org/pmh/.

Descriptive metadata is provided as MODS (the preferred format; @http://www.loc.gov/standards/mods/v3/mods-3-0.xsd), and also as Dublin Core (@http://www.openarchives.org/OAI/2.0/oai_dc.xsd) and OLEF. OLEF is a format defined to facilitate metadata harmonization among BHL Partners (see @http://www.bhle.eu/bhl-schema/v1/ to find out more about the schema and also review this presentation).

The OAI-PMH endpoint for BHL is http://www.biodiversitylibrary.org/oai

We provide 5 sets in BHL:
 * 1) item
 * 2) title
 * 3) part
 * 4) itemexternal
 * 5) partexternal

1) Item: This set contains individual volumes hosted by BHL. The content is viewable in BHL.

2) Title: This set contains the monographs and journals represented in BHL.

3) Part: This set contains articles, chapters, treatments, etc. hosted by BHL. The content is viewable in BHL.

4) Item External: This set contains individual volumes not hosted by BHL. The content must be viewed on a site not maintained by BHL.

5) Part External: This set contains articles, chapters, treatments, etc. not hosted by BHL. The content must be viewed on a site not maintained by BHL.

Most aggregators of BHL content will harvest either the Item and Part sets or the Title and Part sets, but not all three. Whether an aggregator chooses the Item or the Title set depends on the level at which its repository catalogs content.

If an aggregator does not want to harvest external content (i.e., content that is not hosted within the BHL repository, for example http://www.biodiversitylibrary.org/bibliography/73220#/summary), it should not harvest the itemexternal and partexternal sets.
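
The set selection guidance above can be expressed as a small decision helper. This is only a sketch: the function `sets_to_harvest` is hypothetical, and pairing `itemexternal` only with item-level harvesting is an assumption of this example, not something stated by BHL.

```python
from typing import List


def sets_to_harvest(catalog_level: str, include_external: bool = False) -> List[str]:
    """Pick OAI-PMH set names for an aggregator.

    catalog_level: "item" if the repository catalogs individual volumes,
                   "title" if it catalogs monographs and journals.
    """
    if catalog_level not in ("item", "title"):
        raise ValueError("catalog_level must be 'item' or 'title'")
    # Either Item + Part or Title + Part, never all three
    sets = [catalog_level, "part"]
    if include_external:
        # Assumption: external volumes are only relevant when harvesting at item level
        if catalog_level == "item":
            sets.append("itemexternal")
        sets.append("partexternal")
    return sets


print(sets_to_harvest("title"))
print(sets_to_harvest("item", include_external=True))
```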

Some example OAI-PMH operations are:
 * http://www.biodiversitylibrary.org/oai?verb=Identify
 * http://www.biodiversitylibrary.org/oai?verb=ListMetadataFormats
 * http://www.biodiversitylibrary.org/oai?verb=ListSets
 * http://www.biodiversitylibrary.org/oai?verb=ListIdentifiers&metadataPrefix=oai_dc&set=title&from=2009-02-01&until=2009-02-04
 * http://www.biodiversitylibrary.org/oai?verb=ListRecords&metadataPrefix=oai_dc&set=title&from=2009-02-01&until=2009-02-04
 * http://www.biodiversitylibrary.org/oai?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:biodiversitylibrary.org:title/2
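
Request URLs like the examples above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name `oai_request_url` is hypothetical, and the verb check reflects the six verbs defined by the OAI-PMH protocol, not anything specific to BHL.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

OAI_ENDPOINT = "http://www.biodiversitylibrary.org/oai"

# The six request verbs defined by OAI-PMH 2.0
OAI_VERBS = {"Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord"}


def oai_request_url(verb: str, params: Optional[Dict[str, str]] = None) -> str:
    """Build an OAI-PMH request URL for the BHL endpoint."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # A dict is used for the arguments because "from" is a reserved word in Python
    query.update(params or {})
    # Keep ':' and '/' unescaped so identifiers stay readable
    return OAI_ENDPOINT + "?" + urlencode(query, safe=":/")


print(oai_request_url("ListRecords", {
    "metadataPrefix": "oai_dc",
    "set": "title",
    "from": "2009-02-01",
    "until": "2009-02-04",
}))
print(oai_request_url("GetRecord", {
    "metadataPrefix": "oai_dc",
    "identifier": "oai:biodiversitylibrary.org:title/2",
}))
```

The resulting string can be passed to any HTTP client, for example `urllib.request.urlopen`, to fetch the XML response.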

=Code and Documentation=
Available on GitHub: https://github.com/gbhl/bhl-us

=R Interface to BHL API, via rOpenSci=
[]

=Macaw Software=
@https://github.com/cajunjoel/macaw-book-metadata-tool

=Uploading to Internet Archive=
BHL has written instructions on how to upload scanned books to the Internet Archive.
