Add the Nutritional Database API Guide
This commit is contained in:
parent
7bf66d06fc
commit
a3ac976c13
|
@ -0,0 +1,442 @@
|
||||||
|
NDB API Guide
|
||||||
|
=============
|
||||||
|
|
||||||
|
This guide aims to provide a better help than what the USDA gives, by
|
||||||
|
describing both all of the available data and the ways to access it.
|
||||||
|
|
||||||
|
.. contents::
|
||||||
|
:local:
|
||||||
|
:backlinks: none
|
||||||
|
|
||||||
|
The Database
|
||||||
|
------------
|
||||||
|
|
||||||
|
The Nutritional Database is split in two parts:
|
||||||
|
|
||||||
|
* The **Standard Release** database, or **SR**: It holds nutritional
|
||||||
|
information for common foods with no associated brands; useful to answer
|
||||||
|
requests like "regular oatmeal". This part of the database is released
|
||||||
|
yearly in multiple formats, including an Access Database.
|
||||||
|
* The **Branded Foods** database: Holds nutritional information for branded
|
||||||
|
food items from US manufacturers; useful to answer more specific requests
|
||||||
|
like "McFlurry with Oreo cookies".
|
||||||
|
|
||||||
|
There are a few Python packages to provide ways to make use of the Standard
|
||||||
|
Release database, but they only work with the yearly exports as a starting
|
||||||
|
point; not with the API. Furthermore, the API provides access to both
|
||||||
|
databases, while the yearly exports only include the Standard Release.
|
||||||
|
This is why *python-usda* was made.
|
||||||
|
|
||||||
|
Basic items
|
||||||
|
-----------
|
||||||
|
|
||||||
|
These items can be accessed using list endpoints. They provide the basics to
|
||||||
|
later access nutritional information.
|
||||||
|
|
||||||
|
Food items
|
||||||
|
^^^^^^^^^^
|
||||||
|
|
||||||
|
One of the simplest items. A food item has an ID, also called a ``ndbno``
|
||||||
|
(a Nutritional Database number), and a name. A search endpoint is available to
|
||||||
|
search food items by name.
|
||||||
|
|
||||||
|
Food groups
|
||||||
|
^^^^^^^^^^^
|
||||||
|
|
||||||
|
Food items may belong in food groups; requesting for a food item will only
|
||||||
|
give you the food group's name, but it is possible to list food groups
|
||||||
|
themselves and get an ID linked to their name.
|
||||||
|
|
||||||
|
Nutrients
|
||||||
|
^^^^^^^^^
|
||||||
|
|
||||||
|
Nutrients also can be listed, and only have an ID and a name; list endpoints
|
||||||
|
only provide you with IDs and names. However, they can hold measurement data
|
||||||
|
when they are returned inside a report.
|
||||||
|
|
||||||
|
Derivation codes
|
||||||
|
^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Those codes can be listed and provide information as of how a nutrient's
|
||||||
|
measured value has been derived from multiple measurements. This information
|
||||||
|
is not fully supported by *python-usda* but can still be obtained when
|
||||||
|
requesting a report in the ``Statistics`` mode as raw JSON.
|
||||||
|
Nutrients will hold indicator codes that can be linked to descriptions using
|
||||||
|
the list endpoint for derivation codes.
|
||||||
|
|
||||||
|
Reports
|
||||||
|
-------
|
||||||
|
|
||||||
|
To get actual nutritional information, as list endpoints will not give you
|
||||||
|
anything of that sort, you need to ask for a report. There are two types of
|
||||||
|
reports available.
|
||||||
|
|
||||||
|
Food Reports
|
||||||
|
^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Food Reports are what you would find on a product's packaging; all the
|
||||||
|
nutritional facts for a given food item.
|
||||||
|
|
||||||
|
Types
|
||||||
|
'''''
|
||||||
|
|
||||||
|
There are three types of Food Reports that you can request for:
|
||||||
|
|
||||||
|
Basic
|
||||||
|
The most common nutritional information; exactly what you would find on an
|
||||||
|
actual product's packaging.
|
||||||
|
Full
|
||||||
|
Every single available nutrient for this item.
|
||||||
|
Statistics
|
||||||
|
Get more statistics-related information about the nutrient's measurements;
|
||||||
|
their standard error, the way their values have been derived from multiple
|
||||||
|
measurements, etc. This is not fully supported by *python-usda*.
|
||||||
|
|
||||||
|
In *python-usda*, those report types are represented by the
|
||||||
|
:class:`usda.enums.UsdaNdbReportType` enum.
|
||||||
|
|
||||||
|
Measurements
|
||||||
|
''''''''''''
|
||||||
|
|
||||||
|
In each Food Report, you will find a list of nutrients. Those nutrients will
|
||||||
|
not only have an ID and a name, they will also hold a ``value`` and a
|
||||||
|
``unit`` which express the nutrient's quantity in 100 grams of the food item.
|
||||||
|
They also have a ``group`` to let you regroup nutrients in *nutrient groups*;
|
||||||
|
those are different from *food groups* and cannot be listed anywhere else.
|
||||||
|
|
||||||
|
Nutrients will also hold **measures**: their value is their "main measurement"
|
||||||
|
but there can be more than one measurement, usually performed on another
|
||||||
|
volume of the food item or in different conditions.
|
||||||
|
|
||||||
|
Those measurements will have a ``label`` which describes the measurement
|
||||||
|
itself; most of the time, it just states the volume of food used to perform
|
||||||
|
the measurement.
|
||||||
|
|
||||||
|
The official documentation differs from what the API actually returns; what
|
||||||
|
we have is a measured quantity as a decimal value with a missing unit, and a
|
||||||
|
100-gram equivalent for the measurement. *python-usda* tries to handle this
|
||||||
|
misconception simply by abstracting away the problem and using as properties
|
||||||
|
what the API actually says.
|
||||||
|
|
||||||
|
Versions
|
||||||
|
''''''''
|
||||||
|
|
||||||
|
There are two versions of Food Reports:
|
||||||
|
|
||||||
|
* **Version 1** Food Reports provide foot notes as a list of strings that you
|
||||||
|
have to deal with yourself; you cannot link them to any data. It is only
|
||||||
|
possible to request for one Version 1 food report at once.
|
||||||
|
* **Version 2** Food Reports are provided with another endpoint that lets you
|
||||||
|
request up to 25 reports at once, saving some time, and give you footnotes
|
||||||
|
with unique IDs and a new list of Sources that are more easily handled by
|
||||||
|
code.
|
||||||
|
|
||||||
|
Sources
|
||||||
|
'''''''
|
||||||
|
|
||||||
|
Version 2 Food Reports provide a new ``sources`` property; a list of sources,
|
||||||
|
mostly articles, for the measurements returned in the report.
|
||||||
|
|
||||||
|
Sources are mostly designed to hold information about scientific publications:
|
||||||
|
they have an ID, a title, a year of publication, names of the volume and issue
|
||||||
|
they were first published in, and a list of authors as a long string formatted
|
||||||
|
like in a bibliography citation. While this is perfectible, it is already
|
||||||
|
easier to toy with those sources than with raw footnotes.
|
||||||
|
|
||||||
|
Nutrient Reports
|
||||||
|
^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
The Nutritional Database API provides another kind of report; the Nutrient
|
||||||
|
Report. They actually use a list endpoint, not a report endpoint, because they
|
||||||
|
return a list of **food items**.
|
||||||
|
|
||||||
|
For up to 20 nutrients, you can fetch pages and pages of food items with
|
||||||
|
associated nutrients and measurements data. This is perfect to get statistics
|
||||||
|
about a great number of food items and a reduced set of nutrients.
|
||||||
|
|
||||||
|
*python-usda* handles nutrient reports by letting you iterate over them
|
||||||
|
seamlessly, without ever caring about those pages and lists. You can then get
|
||||||
|
food items with an added attribute for a nutrients list, that contain the
|
||||||
|
same kind of information you would get in a Food Report.
|
||||||
|
|
||||||
|
API endpoints
|
||||||
|
-------------
|
||||||
|
|
||||||
|
This section goes deeper in detail about the API endpoints themselves and the
|
||||||
|
implementation in *python-usda*, for those who want to understand some of the
|
||||||
|
design choices or use the API themselves without the assistance of this Python
|
||||||
|
API client.
|
||||||
|
|
||||||
|
There are many quirks that are not described in the API documentation and that
|
||||||
|
are important to know to deal with this API properly, as with many other APIs
|
||||||
|
that do not follow standard practices.
|
||||||
|
|
||||||
|
First of all, every endpoint requires you to give an API key as an
|
||||||
|
``?api_key=`` parameter. For basic testing while doing development, you may
|
||||||
|
use the ``DEMO_KEY`` API key; but this key is strongly rate-limited and should
|
||||||
|
not be used in production. Instead, go get a free Data.gov API key. All you
|
||||||
|
need is to have a name, an e-mail address and to
|
||||||
|
`go here <https://api.data.gov/signup/>`_.
|
||||||
|
|
||||||
|
List endpoints
|
||||||
|
^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
There are three list endpoints: ``/list``, ``/search`` and ``/nutrients``.
|
||||||
|
|
||||||
|
``/list``
|
||||||
|
List food items, food groups, nutrients and derivation codes.
|
||||||
|
``/search``
|
||||||
|
Search food items only, by name.
|
||||||
|
``/nutrients``
|
||||||
|
Get a Nutrient Report.
|
||||||
|
|
||||||
|
List parameters
|
||||||
|
'''''''''''''''
|
||||||
|
|
||||||
|
You can perform GET requests on the ``/list`` endpoint with the following
|
||||||
|
parameters:
|
||||||
|
|
||||||
|
``lt``
|
||||||
|
The list type. Defaults to ``f``.
|
||||||
|
|
||||||
|
* ``d`` for derivation codes;
|
||||||
|
* ``f`` for food items;
|
||||||
|
* ``g`` for food groups;
|
||||||
|
* ``n`` for nutrients;
|
||||||
|
* ``nr`` for all nutrients in the Standard Release database;
|
||||||
|
* ``ns`` for nutrients that are not in the Standard Release database,
|
||||||
|
also known as *specialty nutrients*.
|
||||||
|
|
||||||
|
In *python-usda*, this setting is represented by the
|
||||||
|
:class:`usda.enums.UsdaNdbListType` enum.
|
||||||
|
``max``
|
||||||
|
Maximum number of items to return with each page. Defaults to 50.
|
||||||
|
The official documentation states you can get up to 1,500 items at once;
|
||||||
|
however the API actually limits to 500.
|
||||||
|
``offset``
|
||||||
|
Zero-based index of the first item that should be returned.
|
||||||
|
Defaults to 0. You can use this to perform pagination ;
|
||||||
|
if you got a page with the 50 first results, you can get the next pages by
|
||||||
|
setting this parameter to 50, then 100, then 150, etc.
|
||||||
|
``sort``
|
||||||
|
Field to sort items on. ``n`` for name or ``i`` for ID. Defaults to ``n``.
|
||||||
|
``format``
|
||||||
|
The response return format, ``xml`` or ``json``. Defaults to ``json``.
|
||||||
|
Can also be set using the HTTP Accept header on the request.
|
||||||
|
|
||||||
|
Search parameters
|
||||||
|
'''''''''''''''''
|
||||||
|
|
||||||
|
You can perform GET requests on the ``/search`` endpoint with the following
|
||||||
|
parameters:
|
||||||
|
|
||||||
|
``q``
|
||||||
|
The search query. If left empty, the endpoints acts like ``/list``.
|
||||||
|
``ds``
|
||||||
|
A data source to restrict results to. If left empty, nutrients from all
|
||||||
|
data sources are returned. The two exact following strings can be used:
|
||||||
|
|
||||||
|
* ``Standard Reference``
|
||||||
|
* ``Branded Food Products``
|
||||||
|
``fg``
|
||||||
|
A food group ID to restrict results to. If left empty, no filtering on the
|
||||||
|
food group is performed.
|
||||||
|
``max``
|
||||||
|
Maximum number of items to return with each page. Defaults to 50.
|
||||||
|
The official documentation states you can get up to 1,500 items at once;
|
||||||
|
however the API actually limits to 500.
|
||||||
|
``offset``
|
||||||
|
Zero-based index of the first item that should be returned.
|
||||||
|
Defaults to 0. You can use this to perform pagination;
|
||||||
|
if you got a page with the 50 first results, you can get the next pages by
|
||||||
|
setting this parameter to 50, then 100, then 150, etc.
|
||||||
|
``sort``
|
||||||
|
Field to sort items on. ``n`` for name or ``r`` for relevance to the query.
|
||||||
|
Defaults to ``r``.
|
||||||
|
``format``
|
||||||
|
The response return format, ``xml`` or ``json``. Defaults to ``json``.
|
||||||
|
Can also be set using the HTTP Accept header on the request.
|
||||||
|
|
||||||
|
Nutrient Report
|
||||||
|
'''''''''''''''
|
||||||
|
|
||||||
|
You can perform GET requests on the ``/nutrient`` endpoint with the following
|
||||||
|
parameters:
|
||||||
|
|
||||||
|
``nutrients``
|
||||||
|
A list of up to 20 nutrient IDs to use for the nutrient report.
|
||||||
|
``ndbno``
|
||||||
|
Optionally restrict the nutrient report to a single food item by ID.
|
||||||
|
``fg``
|
||||||
|
A list of up to 10 food group IDs to restrict results to.
|
||||||
|
If left empty, no filtering on the food group is performed.
|
||||||
|
``subset``
|
||||||
|
Boolean: set this to ``1`` to restrict to an abridged list of about 1,000
|
||||||
|
most commonly consumed food items in the United States.
|
||||||
|
Defaults to ``0`` — show all results.
|
||||||
|
``max``
|
||||||
|
Maximum number of items to return with each page. Defaults to 50.
|
||||||
|
The official documentation states you can get up to 1,500 items at once;
|
||||||
|
however the API actually limits to 150.
|
||||||
|
``offset``
|
||||||
|
Zero-based index of the first item that should be returned.
|
||||||
|
Defaults to 0. You can use this to perform pagination;
|
||||||
|
if you got a page with the 50 first results, you can get the next pages by
|
||||||
|
setting this parameter to 50, then 100, then 150, etc.
|
||||||
|
``sort``
|
||||||
|
Field to sort items on. ``f`` for food item or ``c`` for nutrient content.
|
||||||
|
Defaults to ``f``.
|
||||||
|
``format``
|
||||||
|
The response return format, ``xml`` or ``json``. Defaults to ``json``.
|
||||||
|
Can also be set using the HTTP Accept header on the request.
|
||||||
|
|
||||||
|
Responses
|
||||||
|
'''''''''
|
||||||
|
|
||||||
|
List endpoint JSON responses are formatted in the following way:
|
||||||
|
|
||||||
|
.. code:: json
|
||||||
|
|
||||||
|
{
|
||||||
|
"list": {
|
||||||
|
"start": "100",
|
||||||
|
"end": "150",
|
||||||
|
"total": "50",
|
||||||
|
"item": [...]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
The ``list.item`` array will hold all the items you requested for.
|
||||||
|
``list.start`` and ``list.end`` are the start and end indexes on this page,
|
||||||
|
and ``list.total`` is the length of the ``list.item`` array, *not* the total
|
||||||
|
number of results. The ``list`` objects will also usually contain other
|
||||||
|
arguments depending on what you have specified in your request, which could
|
||||||
|
make it possible to write a generic parser for any response, entirely
|
||||||
|
detached from any request.
|
||||||
|
|
||||||
|
*python-usda* uses the :class:`usda.pagination.RawPaginator` class to provide
|
||||||
|
seamless iteration over such paginated endpoints.
|
||||||
|
This class returns raw JSON data which can then be parsed using the
|
||||||
|
:class:`usda.Pagination.ModelPaginator` wrapper.
|
||||||
|
|
||||||
|
However, the Nutrient Report endpoint returns responses in the following way:
|
||||||
|
|
||||||
|
.. code:: json
|
||||||
|
|
||||||
|
{
|
||||||
|
"report": {
|
||||||
|
"start": "100",
|
||||||
|
"end": "150",
|
||||||
|
"total": "50",
|
||||||
|
"foods": [...]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
For everything else, this endpoint works just like the other list endpoints,
|
||||||
|
but the most important parts of the response, the ``list`` object and its
|
||||||
|
``item`` array, are replaced by ``report`` and ``foods``.
|
||||||
|
|
||||||
|
*python-usda* solves this by using a custom class to paginate over this
|
||||||
|
endpoint: :class:`usda.pagination.RawNutrientReportPaginator`.
|
||||||
|
|
||||||
|
Reports endpoints
|
||||||
|
^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Two endpoints are available for food reports:
|
||||||
|
|
||||||
|
``/reports``
|
||||||
|
Request a single Food Report version 1 at once
|
||||||
|
``/V2/reports``
|
||||||
|
Request up to 25 Food Reports version 2 at once. Version 2 Reports add
|
||||||
|
more data on sources and better footnotes.
|
||||||
|
|
||||||
|
Both endpoints can be requested using the same parameters:
|
||||||
|
|
||||||
|
``ndbno``
|
||||||
|
On Food Reports version 1, ID of a single food item to get a report for.
|
||||||
|
On Food Reports version 2, a list of up to 25 food item IDs to get
|
||||||
|
reports for.
|
||||||
|
``type``
|
||||||
|
The report type. Defaults to ``b``.
|
||||||
|
|
||||||
|
* ``b``: Basic report type; what you could find on an actual product's
|
||||||
|
packaging.
|
||||||
|
* ``f``: Full report type; every nutrient available for the food item.
|
||||||
|
* ``s``: Stats report type; additional statistics information from the
|
||||||
|
Standard Release database.
|
||||||
|
|
||||||
|
In *python-usda*, this parameter is represented by the
|
||||||
|
:class:`usda.enums.UsdaNdbReportType` enum.
|
||||||
|
``format``
|
||||||
|
The response return format, ``xml`` or ``json``. Defaults to ``json``.
|
||||||
|
Can also be set using the HTTP Accept header on the request.
|
||||||
|
|
||||||
|
Errors
|
||||||
|
^^^^^^
|
||||||
|
|
||||||
|
The API returns errors in a very inconsistent way. First of all, a warning:
|
||||||
|
|
||||||
|
.. warning:: Do not trust the HTTP status codes.
|
||||||
|
|
||||||
|
This API often returns HTTP 200 statuses when there actually are errors. The
|
||||||
|
easiest way to handle errors is to first check for a JSON body; if there is
|
||||||
|
one, parse it and see if there is an error or if it is an actual result; if
|
||||||
|
there is none, *then* try checking the status code.
|
||||||
|
|
||||||
|
The error JSON bodies are of multiple shapes depending on the kind of error.
|
||||||
|
What follows is a non-exhaustive list of errors, as it is impossible to make
|
||||||
|
sure all errors are covered without a very thorough usage of the API.
|
||||||
|
|
||||||
|
API rate limit exceeded
|
||||||
|
'''''''''''''''''''''''
|
||||||
|
|
||||||
|
.. code:: json
|
||||||
|
|
||||||
|
{
|
||||||
|
"errors": {
|
||||||
|
"error": [
|
||||||
|
{
|
||||||
|
"code": "OVER_RATE_LIMIT",
|
||||||
|
"message": "..."
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
This error is the only known error type where there is an ``errors`` *object*
|
||||||
|
that holds an ``error`` *array*. A developer must have been coding under
|
||||||
|
influence here.
|
||||||
|
|
||||||
|
Invalid API key
|
||||||
|
'''''''''''''''
|
||||||
|
|
||||||
|
.. code:: json
|
||||||
|
|
||||||
|
{
|
||||||
|
"error": {
|
||||||
|
"code": "API_KEY_INVALID",
|
||||||
|
"message": "..."
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Parameter error
|
||||||
|
'''''''''''''''
|
||||||
|
|
||||||
|
This error occurs when one of the GET parameters in a request is invalid.
|
||||||
|
This may be the most useful error message, as it usually also describes the
|
||||||
|
correct values for the parameter in a way easier to understand than the
|
||||||
|
official documentation.
|
||||||
|
|
||||||
|
Note that in this case, the ``code`` property is a number corresponding to an
|
||||||
|
actual HTTP status code that should be returned as the response's status code,
|
||||||
|
but isn't.
|
||||||
|
|
||||||
|
.. code:: json
|
||||||
|
|
||||||
|
{
|
||||||
|
"error": {
|
||||||
|
"code": 400,
|
||||||
|
"parameter": "...",
|
||||||
|
"message": "..."
|
||||||
|
}
|
||||||
|
}
|
Reference in New Issue