---
title: Asahina Antenna Metadata Format (HINA) 2.2, rev. 0.12
date: November 7, 2001
toc: true
---

> This document is an unofficial English translation of the original Japanese
> specification made by someone who has no knowledge of Japanese. Implement at
> your own risk.

## Overview

This document describes Hina-Di, the metadata format used by [Asahina
Antenna][antenna]. In this document, "metadata" is defined as data on a webpage
such as its last update time or its author. Asahina Antenna acts as a feed
reader for Hina-Di.

## Conventions used in this document

This document uses the augmented Backus-Naur Form (BNF) notation described in
[RFC 822][rfc822] to formally specify the format.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14 ([RFC 2119][rfc2119],
[RFC 8174][rfc8174]) when, and only when, they appear in all capitals, as shown
here.

## Data Types

The basic data types that constitute Hina-Di are listed below. The US-ASCII
character set is defined by ANSI X3.4-1986.

```
OCTET     = <any 8-bit sequence of data>
CHAR      = <any US-ASCII character (octets 0 - 127)>
UPALPHA   = <any US-ASCII uppercase letter "A".."Z">
LOALPHA   = <any US-ASCII lowercase letter "a".."z">
ALPHA     = UPALPHA | LOALPHA
DIGIT     = <any US-ASCII digit "0".."9">
WORD      = 1*(ALPHA|DIGIT)

CTL       = <any US-ASCII control character (octets 0 - 31) and DEL (127)>
CR        = <US-ASCII CR, carriage return (13)>
LF        = <US-ASCII LF, linefeed (10)>
SP        = <US-ASCII SP, space (32)>
HT        = <US-ASCII HT, horizontal-tab (9)>
<">       = <US-ASCII double-quote mark (34)>

CRLF      = CR LF

TEXT      = <any OCTET except CTLs, but including HT>
TOKEN     = <any TEXT that does not start with SP or HT>

SEPARATOR = ":" 1*(SP|HT)
DELIMITER = "," *(SP|HT)
SLASH     = "/" *(SP|HT)
```

## Structure

A Hina-Di file consists of a series of blocks that summarize the metadata on a
website: a header block, followed by one or more entity blocks.

```
hina-di = header-block
          1*( entity-block )
```

### Block

A block is a set of metadata for a document. Each piece of metadata is
represented as a single header line, in a manner similar to
[RFC 822][rfc822], with a field name and a field value.

Field names in a block MUST be unique. A block with duplicate field names MUST
be discarded.

Field names are case-sensitive. Field values may be case-sensitive,
depending on the field.

```
line-format = field-name SEPARATOR field-value CRLF
field-name  = WORD *( "-" WORD)
field-value = TOKEN
```

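The block rules above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: `parse_block` and `FIELD_LINE` are hypothetical names, the input is assumed to be already decoded into text, and the exclusion of control characters from `TEXT` is not checked.

```python
import re

# field-name = WORD *("-" WORD), SEPARATOR = ":" 1*(SP|HT),
# field-value = TOKEN (must not start with SP or HT).
FIELD_LINE = re.compile(r"^([A-Za-z0-9]+(?:-[A-Za-z0-9]+)*):[ \t]+(\S.*)$")


def parse_block(lines):
    """Parse the field lines of one block into a dict.

    Returns None when a field name is repeated, because such a block
    MUST be discarded. Raises ValueError for a line that does not
    match line-format.
    """
    fields = {}
    for line in lines:
        match = FIELD_LINE.match(line.rstrip("\r\n"))
        if match is None:
            raise ValueError(f"not a field line: {line!r}")
        name, value = match.groups()
        if name in fields:  # field names are compared case-sensitively
            return None
        fields[name] = value
    return fields
```

Because names are compared case-sensitively, `Title` and `title` count as two different fields in this sketch, while two `Title` lines cause the whole block to be dropped.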
### Header block

Exactly one header block MUST appear in a Hina-Di file, and it MUST be the
first block. It holds metadata about the Hina-Di file itself.

```
header-block  = HINA
                Hinadi-Header
                CRLF
Hinadi-Header = 1*( User-Agent
                  | Content-Type
                  | Date )
```

### Entity block

One or more entity blocks MUST be present after the header block. Each entity
block defines metadata about a specific document.

```
entity-block = URL ( HINA-Version
                   | Virtual
                   | Content-Type
                   | Date
                   | Title
                   | Author-Name
                   | Expires
                   | Expire
                   | Last-Modified
                   | Last-Modified-Detected
                   | Server
                   | Authorized
                   | Authorized-url
                   | Method
                   | Keyword
                   | Image-Width
                   | Image-Height
                   | Experimental-field
                   | Undefined-field )
               CRLF
```

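To make the overall layout concrete, here is a rough Python sketch that splits the text of a Hina-Di file into its version line, header block, and entity blocks. It assumes, based on the grammars above, that every block ends with an empty line (a bare `CRLF`); `split_blocks` and the sample values are hypothetical.

```python
def split_blocks(text):
    """Split a decoded Hina-Di file into (version, header_lines, entity_blocks)."""
    lines = text.split("\r\n")
    if not lines[0].startswith("HINA/"):
        raise ValueError("missing HINA line")
    version = lines[0][len("HINA/"):]

    blocks, current = [], []
    for line in lines[1:]:
        if line == "":  # an empty line closes the current block
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(line)
    if current:  # tolerate a missing final empty line
        blocks.append(current)

    if len(blocks) < 2:
        raise ValueError("expected a header block and at least one entity block")
    return version, blocks[0], blocks[1:]


sample = (
    "HINA/2.2beta\r\n"
    "User-Agent: example-agent/1.0\r\n"
    "\r\n"
    "URL: http://www.hoge.jp/foo/\r\n"
    "Title: Foo\r\n"
    "\r\n"
)
version, header, entities = split_blocks(sample)
```

Each returned block is a list of raw field lines that still needs to be parsed and checked for duplicate field names.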
## Fields

This section defines the various fields that may be found in blocks.
All fields are OPTIONAL, and their values are case-insensitive, unless
otherwise specified.

### HINA

Indicates that this is a Hina-Di file, and includes its version.
This field is REQUIRED as the first field of Hina-Di files.

```
HINA           = "HINA" "/" hinadi-version CRLF
hinadi-version = "2.2beta"
```

### User-Agent

Name of the user agent that created this Hina-Di file.
This field is REQUIRED in header blocks.
The value of this field is case-sensitive.

```
User-Agent = "User-Agent" SEPARATOR TOKEN CRLF
```

### URL

URL of the document, compliant with [RFC 2396][rfc2396].

This field is REQUIRED in entity blocks.
Making this field the first field of an entity block is RECOMMENDED.

The scheme and domain portions of the URL are not case-sensitive.
If the other portions of the URL are not case-sensitive either, they SHOULD be
written using lowercase characters.

```
URL         = "URL" SEPARATOR rfc2396-url CRLF
rfc2396-url = <URI described in section 5.1.2 "Request-URI" in RFC 2396>
```

Implementations can use this field as a unique key that distinguishes the
entity block from other blocks. To ensure proper uniqueness of this field,
the following conditions MUST be respected by the providing Hina-Di user
agents or their administrators:

* If the URL can end in a slash (`/`), then it SHOULD end in a slash.
  Prefer `http://www.hoge.jp/foo/` over `http://www.hoge.jp/foo`.
* If the URL includes a file name, but the file name can be omitted,
  then it SHOULD be omitted.
  Prefer `http://www.hoge.jp/foo/` over `http://www.hoge.jp/foo/index.html`.

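When the `URL` value is used as a lookup key, the case-insensitive parts can be folded to lowercase before comparison. The sketch below uses Python's standard `urllib.parse`; `normalize_url` is a hypothetical helper. It leaves the path untouched, since the trailing-slash and file-name rules depend on how the server is configured and cannot be applied mechanically.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase the scheme and host of a URL, leaving the rest as written."""
    parts = urlsplit(url)
    # Keep any "user:password@" prefix as-is; only the host[:port] part
    # is treated as case-insensitive here.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )
```

For example, `normalize_url("HTTP://WWW.Hoge.JP/Foo/")` yields `http://www.hoge.jp/Foo/`: the path keeps its case because it may be case-sensitive on the server.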
### HINA-Version

Specifies that the integrity of the entity block was guaranteed according to
the specification of a specific Hina-Di version.
If this field is missing from an entity block, it means the block might be
incomplete.

```
HINA-Version = "HINA-Version" SEPARATOR version CRLF
version      = "HINA" "/" 1*( DIGIT ) "." 1*( DIGIT )
```

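As an illustration, the `version` rule maps directly onto a regular expression. The following Python sketch (with the hypothetical names `VERSION` and `parse_hina_version`) returns the major and minor numbers, or `None` when the value does not match the rule.

```python
import re

# version = "HINA" "/" 1*( DIGIT ) "." 1*( DIGIT )
VERSION = re.compile(r"HINA/([0-9]+)\.([0-9]+)")


def parse_hina_version(value):
    """Return (major, minor) for a HINA-Version value, or None if it is malformed."""
    match = VERSION.fullmatch(value)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Note that this rule only admits digits, so a value such as `HINA/2.2beta` is rejected here even though `2.2beta` is the version string used on the `HINA` line of the file.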
### Virtual

URL of another Hina-Di file that holds the entity block, compliant with
[RFC 2396][rfc2396].

If there are fields in the entity block other than `Virtual`, then it takes the
same meaning as the regular `URL` field.

The case-sensitivity and URL uniqueness conditions defined for the `URL` field
MUST be followed for this field.

```
Virtual     = "Virtual" SEPARATOR rfc2396-url CRLF
rfc2396-url = <URI described in section 5.1.2 "Request-URI" in RFC 2396>
```

> Note that the original version of the document defines the `Virtual` field
> as `Vitural`.

### Content-Type

MIME type of the Hina-Di file or the document, as described in
[RFC 1521][rfc1521].
The value of this field is case-sensitive to the extent defined by RFC 1521.

```
Content-Type = "Content-Type" SEPARATOR rfc1521-type CRLF
rfc1521-type = <Content-Type as described in RFC 1521>
```

### Date

The date and time when the block or the Hina-Di file was generated.
Dates MUST comply with [RFC 1123][rfc1123]; the format is described in more
detail in section 3.3 of [RFC 2616][rfc2616].
The value of this field is case-sensitive.

```
Date         = "Date" SEPARATOR rfc1123-date CRLF
rfc1123-date = <rfc1123-date described in section 3.3 "Date/Time Formats" in RFC 2616>
```

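In Python, the RFC 1123 date format can be read and written with the standard `email.utils` module. The helper names below are hypothetical, and the sketch is meant for conversion rather than validation: `parsedate_to_datetime` is lenient and also accepts some dates that are not strictly in this format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def parse_hina_date(value):
    """Convert a date field value into a datetime object."""
    return parsedate_to_datetime(value)


def format_hina_date(moment):
    """Format a datetime as an RFC 1123 date in GMT.

    `moment` should be timezone-aware; a naive datetime is interpreted
    as local time by astimezone().
    """
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


stamp = parse_hina_date("Wed, 07 Nov 2001 12:34:56 GMT")
```

The same helpers apply to the other date-valued fields (`Expires`, `Expire`, `Last-Modified`, and `Last-Modified-Detected`), which share the `rfc1123-date` rule.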
### Title

The title of the document.

```
Title = "Title" SEPARATOR TOKEN CRLF
```

### Author-Name

Name of the author of the document.
The value of this field is case-sensitive.

```
Author-Name = "Author-Name" SEPARATOR TOKEN CRLF
```

### Expires

Expiration date for the block. Dates MUST comply with [RFC 1123][rfc1123]; the
format is described in more detail in section 3.3 of [RFC 2616][rfc2616].
The value of this field is case-sensitive to the extent defined by RFC 2616.

```
Expires = "Expires" SEPARATOR rfc1123-date CRLF
```

### Expire

Alias for the `Expires` field, included for backwards compatibility.

```
Expire = "Expire" SEPARATOR rfc1123-date CRLF
```

### Last-Modified

Date and time when the document was last updated. Dates MUST comply with
[RFC 1123][rfc1123]; the format is described in more detail in section 3.3 of
[RFC 2616][rfc2616].
The value of this field is case-sensitive to the extent defined by RFC 2616.

```
Last-Modified = "Last-Modified" SEPARATOR rfc1123-date CRLF
```

### Last-Modified-Detected

Date and time when the user agent retrieved the document's metadata. Dates
MUST comply with [RFC 1123][rfc1123]; the format is described in more detail
in section 3.3 of [RFC 2616][rfc2616].
The value of this field is case-sensitive to the extent defined by RFC 2616.

```
Last-Modified-Detected = "Last-Modified-Detected" SEPARATOR rfc1123-date CRLF
```

### Server

User agent string of the server used to retrieve the metadata of the document
described by this entity block.

```
Server = "Server" SEPARATOR TOKEN CRLF
```

### Authorized

The user agent that retrieved the metadata of the document described by this
entity block.

```
Authorized = "Authorized" SEPARATOR TOKEN CRLF
```

### Authorized-url

URL of a page describing the user agent referred to in the `Authorized` field,
compliant with [RFC 2396][rfc2396].

The case-sensitivity and URL uniqueness conditions defined for the `URL` field
MUST be followed for this field.

```
Authorized-url = "Authorized-url" SEPARATOR rfc2396-url CRLF
```

### Method

Describes the chain of propagation that this entity block went through.

```
Method      = "Method" SEPARATOR method-type *(SLASH method-type) (SLASH result-code) CRLF
method-type = "GET" | "HEAD" | "FILE" | "REMOTE"
result-code = <URI described on "???????" in RFC 2396>
```

#### Method types

GET
: Metadata retrieved using an HTTP GET request.
HEAD
: Metadata retrieved using an HTTP HEAD request.
FILE
: Metadata retrieved from a local file's timestamp.
REMOTE
: Metadata retrieved from an entity block generated by another agent.

#### Example

```
Method: REMOTE/REMOTE/GET/200
```

1. A first user agent retrieved the metadata on the document using an HTTP GET
   and got a 200 response code (`GET/200`).
2. A second user agent retrieved the first user agent's Hina-Di file, then
   propagated it to its own file (`REMOTE`).
3. A third user agent retrieved the second user agent's Hina-Di file, then
   propagated it to its own file (`REMOTE`).

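The chain in the example above can be handled with a short Python sketch. It assumes, as the example suggests, that the last component is the result code and that each propagating agent prepends `REMOTE`; the result code is kept as an opaque string because its grammar is not pinned down here. `parse_method` and `propagate_method` are hypothetical names.

```python
METHOD_TYPES = {"GET", "HEAD", "FILE", "REMOTE"}


def parse_method(value):
    """Split a Method value into (list of method types, result code)."""
    # SLASH = "/" *(SP|HT): whitespace around components is dropped.
    parts = [part.strip(" \t") for part in value.split("/")]
    *methods, result = parts
    if not methods or any(method not in METHOD_TYPES for method in methods):
        raise ValueError(f"invalid Method value: {value!r}")
    return methods, result


def propagate_method(value):
    """Return the Method value to publish after copying a remote entity block."""
    return "REMOTE/" + value
```

Applying `propagate_method` twice to `GET/200` reproduces the example value `REMOTE/REMOTE/GET/200`, which reads from the most recent hop on the left to the original retrieval on the right.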
### Keyword

Words that can be used to give an overview of the document described by this
entity block, such as tags or categories. The value of this field is
case-sensitive.

```
Keyword  = "Keyword" SEPARATOR keywords CRLF
keywords = TOKEN *(DELIMITER TOKEN)
```

### Image-Width

Width of an image described by an entity block, in pixels.

This field MUST NOT be used for entity blocks that do not describe images.

```
Image-Width = "Image-Width" SEPARATOR width CRLF
width       = 1*DIGIT
```

### Image-Height

Height of an image described by an entity block, in pixels.

This field MUST NOT be used for entity blocks that do not describe images.

```
Image-Height = "Image-Height" SEPARATOR height CRLF
height       = 1*DIGIT
```

### Experimental fields

Implementations MAY define custom fields with an `X-` prefix to provide
additional metadata not covered in this specification. Implementations MUST NOT
assume that all clients will use each of those fields. Clients that do not
support a given experimental field SHOULD ignore it.

Experimental fields MAY include data that is not directly related to metadata
that the document has, and SHOULD be used should an implementor create a field
for that purpose.

```
Experimental-field = x-field-name SEPARATOR TOKEN CRLF
x-field-name       = "X-" WORD *("-" WORD)
```

### Undefined fields

Any field that is not defined in this specification. Implementations that
encounter such fields and do not support them SHOULD ignore them.

```
Undefined-field  = undef-field-name SEPARATOR TOKEN CRLF
undef-field-name = WORD *("-" WORD)
```

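A reader typically needs to sort each field name into one of three groups: defined by this specification, experimental, or undefined. The Python sketch below shows one way to do that; `classify_field` is a hypothetical helper, and treating a lowercase `x-` prefix as undefined is an assumption that follows from field names being case-sensitive.

```python
import re

# Field names defined in this specification (header and entity blocks).
DEFINED_FIELDS = frozenset({
    "User-Agent", "URL", "HINA-Version", "Virtual", "Content-Type", "Date",
    "Title", "Author-Name", "Expires", "Expire", "Last-Modified",
    "Last-Modified-Detected", "Server", "Authorized", "Authorized-url",
    "Method", "Keyword", "Image-Width", "Image-Height",
})

# x-field-name = "X-" WORD *("-" WORD)
X_FIELD = re.compile(r"X-[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def classify_field(name):
    """Return "defined", "experimental", or "undefined" for a field name."""
    if name in DEFINED_FIELDS:
        return "defined"
    if X_FIELD.fullmatch(name):
        return "experimental"
    return "undefined"
```

A client that has no use for a field classified as experimental or undefined can simply skip it, in line with the SHOULD-ignore guidance above.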
## Encoding

The character encoding of the Hina-Di file SHOULD be specified as a parameter
of the `Content-Type` field of the header block. If it is not specified,
it defaults to `EUC-JP`.

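One way to apply this rule in Python is to let the standard `email.message.Message` class parse the `Content-Type` value and fall back to `EUC-JP` when no `charset` parameter is present. `hina_charset` is a hypothetical helper; note that `get_content_charset` returns the charset name in lowercase.

```python
from email.message import Message

DEFAULT_CHARSET = "euc-jp"  # default named by this specification


def hina_charset(content_type=None):
    """Return the charset to use for a Hina-Di file.

    `content_type` is the value of the header block's Content-Type
    field, or None when the field is absent.
    """
    if content_type is None:
        return DEFAULT_CHARSET
    message = Message()
    message["Content-Type"] = content_type
    return message.get_content_charset(failobj=DEFAULT_CHARSET)
```

Because field names are plain US-ASCII, a reader can locate the header block's `Content-Type` line in the raw bytes first, then decode the whole file with the charset this function returns.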
## Propagation

In Hina-Di, metadata propagation consists of acquiring metadata from other
agents, then republishing it unchanged in the user agent's own Hina-Di file.
This can be used for aggregation services or a peer-to-peer network.

The `Authorized` and `Authorized-url` fields indicate the user agent from which
the metadata originally came, to help ensure its legitimacy.
Propagation MUST only be performed if both fields are defined and if the user
agent is trusted.

When propagating, all fields of an entity block defined in this specification,
with the exception of experimental and undefined fields or of fields with empty
values, MUST be reproduced without modification.
Propagating experimental or undefined fields is not guaranteed.
A header block, or any field that is part of it, MUST NOT be propagated.

The `Method` field MUST be processed according to the process described in the
Method section.

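The propagation rules can be summarized in a short Python sketch. Several points are assumptions made for illustration: the entity block is already parsed into a dict, trust is decided by looking up the `Authorized` value in a caller-supplied set (`trusted_agents`), experimental and undefined fields are dropped rather than forwarded, and the `Method` value is updated by prepending `REMOTE` as in the Method example.

```python
# Entity-block field names defined in this specification.
DEFINED_FIELDS = frozenset({
    "URL", "HINA-Version", "Virtual", "Content-Type", "Date", "Title",
    "Author-Name", "Expires", "Expire", "Last-Modified",
    "Last-Modified-Detected", "Server", "Authorized", "Authorized-url",
    "Method", "Keyword", "Image-Width", "Image-Height",
})


def propagate_entity(fields, trusted_agents):
    """Return the fields to republish, or None when propagation is not allowed."""
    agent = fields.get("Authorized")
    if not agent or not fields.get("Authorized-url"):
        return None  # both fields must be defined
    if agent not in trusted_agents:
        return None  # the originating agent must be trusted

    # Copy defined, non-empty fields without modification.
    copied = {
        name: value
        for name, value in fields.items()
        if name in DEFINED_FIELDS and value
    }
    # Record the extra hop in the propagation chain.
    if "Method" in copied:
        copied["Method"] = "REMOTE/" + copied["Method"]
    return copied
```

Header-block fields never reach this function in the sketch, since only entity blocks are passed in; the propagating agent writes its own header block.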
[antenna]: http://masshy.fastwave.gr.jp/hina/release/
[rfc822]: https://tools.ietf.org/html/rfc822
[rfc1123]: https://tools.ietf.org/html/rfc1123
[rfc1521]: https://tools.ietf.org/html/rfc1521
[rfc2119]: https://tools.ietf.org/html/rfc2119
[rfc2396]: https://tools.ietf.org/html/rfc2396
[rfc2616]: https://tools.ietf.org/html/rfc2616
[rfc8174]: https://tools.ietf.org/html/rfc8174