Fixed other data field example to match specification; replaced the 'from' field with 'origin' to avoid a keyword collision with SQL

aewens 2020-12-29 18:28:03 +00:00
parent fecc1b1e1c
commit 27946d775a
1 changed files with 9 additions and 9 deletions


@@ -67,21 +67,21 @@ The _id_ field will never be transmitted since it is applicable only to the serv
The unique fields for this table are as follows:
- Type: what type of data this entry is, represented as a string up to 64 characters
- From: where the data came from, represented as a string up to 64 characters
- Origin: where the data originated from, represented as a string up to 64 characters
- Data: the raw dump of the data, represented as either an unbounded text or binary blob (transmitted using the latter)
The _type_ field scopes related entries so they are easier to tell apart from the rest of the entries in this table. Since these entries primarily store data acquired from other services or data a service plans to share with other services, the _from_ field indicates where the data came from (the service itself or an external service). The _from_ field can also be viewed as a way to link data to another entry using its UUID, or as a way of assigning ownership to the data (e.g. a username). The _data_ field is the most important field and is where the bulk of the data will live. While storing raw data in a SQL table can be viewed as a bit controversial, data structures vary so widely from use case to use case that any attempt to fit the raw data used by a service into a fixed schema would require deep compromises. Instead, this approach accepts this fact of life, stores the data as raw binary, and leaves the interpretation of the data up to the reader.
The _type_ field scopes related entries so they are easier to tell apart from the rest of the entries in this table. Since these entries primarily store data acquired from other services or data a service plans to share with other services, the _origin_ field indicates where the data originated (the service itself or an external service). The _origin_ field can also be viewed as a way to link data to another entry using its UUID, or as a way of assigning ownership to the data (e.g. a username). The _data_ field is the most important field and is where the bulk of the data will live. While storing raw data in a SQL table can be viewed as a bit controversial, data structures vary so widely from use case to use case that any attempt to fit the raw data used by a service into a fixed schema would require deep compromises. Instead, this approach accepts this fact of life, stores the data as raw binary, and leaves the interpretation of the data up to the reader.
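The field is named _origin_ rather than _from_ because `FROM` is a reserved word in SQL. The following is a minimal sketch of that collision, assuming Python's built-in `sqlite3` module and a hypothetical table name `raw_data`; neither the database engine nor the table name comes from this specification.

```python
import sqlite3


def can_create_column(column_name: str) -> bool:
    """Return True if SQLite accepts column_name as an unquoted column name.

    The table name `raw_data` and the column types are illustrative
    assumptions, not part of the specification.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            f"CREATE TABLE raw_data (uuid TEXT, type TEXT, {column_name} TEXT, data BLOB)"
        )
        return True
    except sqlite3.OperationalError:
        # SQLite rejects reserved words such as FROM when they are not quoted.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    for name in ("from", "origin"):
        print(name, can_create_column(name))
```

Quoting the identifier (for example `"from"`) would also work in most engines, but a non-reserved name avoids the need to quote it in every query.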
#### Conventions
The _type_ and _from_ fields provide enough information to determine which decoding should be used on the raw data to extract the meaningful information for the service reading it. If the data originated from the service itself rather than from another service, it should use the name of the service to express this (rather than something like "self"), so that no conversions need to be made if the data is ever transmitted. The conventional use of the _data_ field is to hold a JSON encoding of the raw data. Almost every language has an implementation for reading JSON, so this keeps the barrier to entry low for would-be service developers.
The _type_ and _origin_ fields provide enough information to determine which decoding should be used on the raw data to extract the meaningful information for the service reading it. If the data originated from the service itself rather than from another service, it should use the name of the service to express this (rather than something like "self"), so that no conversions need to be made if the data is ever transmitted. The conventional use of the _data_ field is to hold a JSON encoding of the raw data. Almost every language has an implementation for reading JSON, so this keeps the barrier to entry low for would-be service developers.
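As a rough illustration of this convention, a reader might select a decoder from the (_type_, _origin_) pair and fall back to JSON. The `DECODERS` registry and the function names below are hypothetical and are not defined by this specification.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry mapping (type, origin) to a decoder for the raw bytes.
# It is left empty here, so every entry falls back to the JSON convention.
DECODERS: Dict[Tuple[str, str], Callable[[bytes], Any]] = {}


def decode_json(raw: bytes) -> Any:
    """Decode the conventional JSON encoding of the data field."""
    return json.loads(raw.decode("utf-8"))


def decode_entry(entry: Dict[str, Any]) -> Any:
    """Pick a decoder from the entry's type and origin, defaulting to JSON."""
    decoder = DECODERS.get((entry["type"], entry["origin"]), decode_json)
    raw = entry["data"]
    if isinstance(raw, str):
        # Accept a text representation as well as a binary blob.
        raw = raw.encode("utf-8")
    return decoder(raw)


if __name__ == "__main__":
    entry = {"type": "example", "origin": "specs", "data": "{\"raw\":\"data\"}"}
    print(decode_entry(entry))
```

A service with its own binary format could register a decoder under its (_type_, _origin_) key without affecting entries that follow the JSON convention.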
#### Rationale
While the schematics for this table appear to be a lack of schematics, they answer the following questions that matter when sharing data in a distributed network:
- How do I reference this data in the network? (i.e. the _uuid_ field)
- What scope does this data belong to? (i.e. the _type_ field)
- Where did this data come from? (i.e. the _from_ field)
- Where did this data originate from? (i.e. the _origin_ field)
Along with the questions answered by the common fields, this solves the bulk of the issues with data management while placing little restriction on the data itself: the only requirement is that it can be encoded to binary (which should be true of any data structure).
@@ -93,7 +93,7 @@ Where the raw data table is an open-ended system of holding arbitrary data for i
- Body: the contents of the entry, represented as an unbounded text blob
- Data: an optional reference to a raw data table entry, represented using the _uuid_ field (but can be stored using the _id_ field)
This table shares many similarities with the raw data table in the fields it provides, but its intended usage is far more rigid. The _type_ field here serves the same purpose as in the raw data table, but instead of a _from_ field there is a _name_ field. This is because all refined data belongs to and originates from the providing service (so there is no need to ask what service the data came from), and because these entries are meant to be consumed by humans (and as such need a label to differentiate them from all the other provided data). The _body_ field is the meat of the entry, containing a series of Unicode (UTF-8) characters. Entries in this table can be used to express a to-do task, quick note, blog article, or even the pages of a book. These entries can also be a refined version of previous raw data, which is what the optional _data_ field is used to link back to. However, the _data_ field can also be used to extend the entry with additional metadata that does not fit into the fields provided by the refined data table.
This table shares many similarities with the raw data table in the fields it provides, but its intended usage is far more rigid. The _type_ field here serves the same purpose as in the raw data table, but instead of an _origin_ field there is a _name_ field. This is because all refined data belongs to and originates from the providing service (so there is no need to ask what service the data came from), and because these entries are meant to be consumed by humans (and as such need a label to differentiate them from all the other provided data). The _body_ field is the meat of the entry, containing a series of Unicode (UTF-8) characters. Entries in this table can be used to express a to-do task, quick note, blog article, or even the pages of a book. These entries can also be a refined version of previous raw data, which is what the optional _data_ field is used to link back to. However, the _data_ field can also be used to extend the entry with additional metadata that does not fit into the fields provided by the refined data table.
#### Conventions
@@ -145,20 +145,20 @@ The following template is the valid JSON structure for entries from the data tab
"updated": 1608518551,
"flag": 0,
"type": "example",
"from": "specs",
"origin": "specs",
"data": "{\"raw\":\"data\"}"
}
```
```json
{
"uuid": "d04b98f48e8f8bcc15c6ae5ac050801cd6dcfd428fb5f9e65c4e16e7807340fa",
"uuid": "d04b98f48e8f8bcc15c6ae5ac050801cd6dcfd428fb5f9e65c4e16e7807340fb",
"added": 1608518551,
"updated": 1608518551,
"flag": 0,
"type": "example",
"name": "Refined Data Example",
"data": "{\"hello\":\"world\"}"
"data": "d04b98f48e8f8bcc15c6ae5ac050801cd6dcfd428fb5f9e65c4e16e7807340fa"
}
```
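In the refined data template, the _data_ field holds the _uuid_ of a raw data entry rather than an inline payload. The sketch below shows one way a reader might follow that reference. It assumes entries are held in a dictionary keyed by _uuid_ and that the raw entry's _uuid_ is the value referenced above; the `resolve_raw` helper is hypothetical.

```python
from typing import Any, Dict, Optional

RAW_UUID = "d04b98f48e8f8bcc15c6ae5ac050801cd6dcfd428fb5f9e65c4e16e7807340fa"
REFINED_UUID = "d04b98f48e8f8bcc15c6ae5ac050801cd6dcfd428fb5f9e65c4e16e7807340fb"

# Assumed in-memory stand-in for the raw data table, keyed by uuid.
raw_table: Dict[str, Dict[str, Any]] = {
    RAW_UUID: {
        "uuid": RAW_UUID,
        "type": "example",
        "origin": "specs",
        "data": "{\"raw\":\"data\"}",
    }
}

refined_entry: Dict[str, Any] = {
    "uuid": REFINED_UUID,
    "type": "example",
    "name": "Refined Data Example",
    "data": RAW_UUID,  # reference to the raw entry, not the payload itself
}


def resolve_raw(
    refined: Dict[str, Any], raw_entries: Dict[str, Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Return the raw entry a refined entry points to, or None if absent."""
    reference = refined.get("data")
    if reference is None:
        # The data field is optional on refined entries.
        return None
    return raw_entries.get(reference)


if __name__ == "__main__":
    print(resolve_raw(refined_entry, raw_table))
```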
@@ -171,7 +171,7 @@ Transmitting the tagging data is special, as the SQL representation of the relat
"updated": 1608518551,
"flag": 0,
"type": "example",
"from": "specs",
"origin": "specs",
"data": "{\"raw\":\"data\"}",
"tags": ["raw", "data"]
}