TTSB custom feeds #8

Closed
opened 2020-03-31 20:23:23 +00:00 by lucidiot · 1 comment
Owner

Language | English | Taiwanese
-------- | ------- | ---------
Aviation | [link](https://www.ttsb.gov.tw/english/16051/16052/16053/16058/Lpsimplelist?Page=1&PageSize=1000&type=) | [link](https://www.ttsb.gov.tw/1133/1154/1155/1159/Lpsimplelist?Page=1&PageSize=1000&type=)
Marine | [link](https://www.ttsb.gov.tw/english/16051/16052/16062/16067/Lpsimplelist?Page=1&PageSize=1000&type=) | [link](https://www.ttsb.gov.tw/1133/1154/1163/22877/Lpsimplelist?Page=1&PageSize=1000&type=)
Rail | [link](https://www.ttsb.gov.tw/english/16051/16052/16071/16076/Lpsimplelist?Page=1&PageSize=1000&type=) | [link](https://www.ttsb.gov.tw/1133/1154/1168/22890/Lpsimplelist?Page=1&PageSize=1000&type=)
Highway | [link](https://www.ttsb.gov.tw/english/16051/16052/16080/16085/Lpsimplelist?Page=1&PageSize=1000&type=) | [link](https://www.ttsb.gov.tw/1133/1154/1173/22903/Lpsimplelist?Page=1&PageSize=1000&type=)

Note that the marine, rail and highway report lists are all empty in both languages for some reason.
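All the list URLs above share the same query string (`Page`, `PageSize`, `type`), so they could be generated rather than hard-coded. A minimal sketch, assuming only the two aviation paths are needed; the `LIST_PATHS` mapping, its keys and the `list_url` helper are hypothetical names used for illustration, not part of any existing code:

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ttsb.gov.tw"

# Paths copied from the aviation row of the table; the mapping layout and
# the "english"/"taiwanese" keys are assumptions made for this sketch.
LIST_PATHS = {
    ("aviation", "english"): "/english/16051/16052/16053/16058/Lpsimplelist",
    ("aviation", "taiwanese"): "/1133/1154/1155/1159/Lpsimplelist",
}


def list_url(mode, language, page=1, page_size=1000):
    """Build a report list URL with the same query string as the table links."""
    query = urlencode({"Page": page, "PageSize": page_size, "type": ""})
    return f"{BASE_URL}{LIST_PATHS[(mode, language)]}?{query}"


print(list_url("aviation", "english"))
```

Whether the site accepts other `PageSize` values, or what `type` filters on, is not something the links above reveal; the defaults simply mirror them.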
Author
Owner

An interesting note on some things I am encountering while parsing the Chinese feeds:

* The language code requires two subtags on top of the language: `zh-Hant-TW`, to indicate that this is Traditional Chinese from Taiwan. Working this out required me to read two RFCs and look up two ISO standards and one IANA registry.
* The feeds use the [Republic of China Calendar](https://en.wikipedia.org/wiki/Republic_of_China_calendar), which is apparently just the Gregorian calendar but 1911 years behind, and which requires some strange parsing that might break in 1998 years.

Apart from this weirdness, the attributes on the `<tr>` tags are actually pretty nice and make the rest of the parsing easy!

Since there are no investigations other than aviation ones in either language, I will not create feeds for the other types, as I am unsure what the HTML will look like and what data it will hold. If, in the future, someone finds out that the TTSB now has actual investigations of another type, please contact me by commenting on this issue or by any other means.
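The two quirks noted above can be sketched roughly as follows. This is only an illustration, not the parser actually used for the feeds: the `YYY-MM-DD` date layout is an assumption (the real pages may format dates differently), both function names are hypothetical, and the tag splitter handles only the language, script and region subtags, ignoring the rest of BCP 47.

```python
import re
from datetime import date

# ROC year 1 is Gregorian 1912, so the offset between the two is 1911.
ROC_YEAR_OFFSET = 1911


def parse_roc_date(text):
    """Convert an ROC date such as '109-03-31' to a Gregorian date.

    The 'year-month-day' layout with '-', '/' or '.' separators is an
    assumption about the input, not something confirmed from the site.
    """
    match = re.fullmatch(r"\s*(\d{1,4})[-/.](\d{1,2})[-/.](\d{1,2})\s*", text)
    if match is None:
        raise ValueError(f"unrecognised ROC date: {text!r}")
    roc_year, month, day = (int(group) for group in match.groups())
    return date(roc_year + ROC_YEAR_OFFSET, month, day)


def split_language_tag(tag):
    """Split a simple tag like 'zh-Hant-TW' into language, script and region."""
    parts = tag.split("-")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha() and result["script"] is None:
            # Script subtags are four letters, e.g. Hant for Traditional Chinese.
            result["script"] = part.title()
        elif len(part) == 2 and part.isalpha() and result["region"] is None:
            # Two-letter region subtags, e.g. TW for Taiwan.
            result["region"] = part.upper()
    return result


print(parse_roc_date("109-03-31"))
print(split_language_tag("zh-Hant-TW"))
```

Keeping the offset in a single constant makes the conversion easy to audit, although the sketch does nothing to tell an ROC year apart from a Gregorian one if the site ever mixes the two.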
lucidiot self-assigned this 2020-07-25 20:07:39 +00:00
lucidiot added the feed label 2020-07-26 13:53:21 +00:00