Abstract: This article looks at how the universal time zone database is organized, how time zone and daylight saving time rules are maintained in it, and how GaussDB (DWS) uses that data.
This article is shared from the HUAWEI CLOUD community article "What You Should Know About Time Zones"; original author: leapdb.
1. Background introduction
This article takes a closer look at how the universal time zone database is organized, how time zone and daylight saving time rules are maintained, and how they are used in GaussDB (DWS).
2. Introduction to Universal Time Zone Database
Local time zone and daylight saving time rules are managed independently by individual governments and often change with little notice. Their history and future plans are recorded only intermittently. The universal time zone database attempts to collect and organize the relevant data.
The time zone database, usually called tz, tzdata, or zoneinfo, is a collection of code and data that represents the history of local time for many representative locations around the world. It is updated periodically to reflect changes that governments make to time zone boundaries and daylight saving time rules. Each entry in the database describes the civil clock of a region in which clocks have agreed since 1970. The database is used by many projects, such as the GNU C Library (used in GNU/Linux), Android, FreeBSD, NetBSD, OpenBSD, Chromium OS, Cygwin, MariaDB, MINIX, MySQL, webOS, AIX, BlackBerry 10, iOS, macOS, Microsoft Windows, OpenVMS, Oracle Database, and Oracle Solaris. GaussDB (DWS), like these widely used software products, also uses the universal time zone data maintained by IANA.
The database was founded by Arthur David Olson and is edited and maintained by Paul Eggert, which is why it is also called the Olson database in some places. Its most notable feature is a set of uniform time zone naming rules designed by Paul Eggert: each time zone is given a unique name of the form "area/location", such as "America/New_York". Spaces in English place names are replaced with underscores ("_"), and a hyphen ("-") is used only where the place name itself contains one. Today the database is generally referred to as the Olson time zone database or the IANA time zone database.
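The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name make_tz_name and the character check are assumptions made for this example, not part of the tz code or an official grammar.

```python
import re

# Simplified, hypothetical check for one name component: ASCII letters,
# digits, and the punctuation that tz names use in practice.
_COMPONENT = re.compile(r"^[A-Za-z0-9._+-]+$")


def make_tz_name(area: str, location: str) -> str:
    """Build an "area/location" name, replacing spaces with underscores."""
    parts = [p.strip().replace(" ", "_") for p in (area, location)]
    for part in parts:
        if not _COMPONENT.match(part):
            raise ValueError(f"unsupported characters in {part!r}")
    return "/".join(parts)


print(make_tz_name("America", "New York"))   # America/New_York
print(make_tz_name("Africa", "Porto-Novo"))  # Africa/Porto-Novo
```

A hyphen survives only because it is part of the place name itself, as in Porto-Novo.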
The database's home changed in 2011, partly because of the impending retirement of Arthur David Olson and partly because of a copyright infringement lawsuit (since withdrawn) against its maintainers. On October 14, 2011, the Internet Corporation for Assigned Names and Numbers (ICANN), which operates IANA, took over maintenance of the time zone database. It is updated regularly to reflect changes that political entities make to time zone boundaries, UTC offsets, and daylight saving time rules. Updates to tz are managed according to the BCP 175 process.
Some countries change their time zone rules frequently. IANA publishes updated time zone data and the source code for parsing it several times a year. IANA provides three ways to access the time zone database:
- https://www.iana.org/time-zones
- ftp://ftp.iana.org/tz/
- rsync://rsync.iana.org/tz/
The time zone database contains text files of the original time zone definitions of all continents and code files for parsing these text files.
2.1 Universal time zone database source code
Related information:
Source code repository: https://github.com/eggert/tz
Introduction to the time zone database: https://data.iana.org/time-zones/tz-link.html
Time zone database theory and usage: https://data.iana.org/time-zones/theory.html
Download method:
mkdir tzdb
cd tzdb
wget https://www.iana.org/time-zones/repository/tzcode-latest.tar.gz # download the latest code files that parse the time zone text definitions
wget https://www.iana.org/time-zones/repository/tzdata-latest.tar.gz # download the latest time zone definition text files
gzip -dc tzcode-latest.tar.gz | tar -xf -
gzip -dc tzdata-latest.tar.gz | tar -xf -
# Or download the complete archive containing both code and data
wget https://www.iana.org/time-zones/repository/tzdb-latest.tar.lz # download the complete time zone database (code + data)
lzip -dc tzdb-latest.tar.lz | tar -xf -
Code structure:
Time zone definition files (text files) contained in the time zone database:
africa antarctica asia australasia europe northamerica southamerica
Code files contained in the time zone database (used to parse the time zone definition files):
asctime.c date.c difftime.c localtime.c strftime.c zdump.c zic.c
Installation directory structure:
The installation directory structure after compilation is as follows:
cd tzdb-2020a
make TOPDIR=$HOME/tzdir install
leap@gaussdb:~> ./sbin/tree -L 3 tzdir/ | more
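tzdir/
├── etc
│   └── localtime        # local time zone file
└── usr
    ├── bin
    │   ├── tzselect     # tool for selecting a time zone
    │   └── zdump        # tool that prints the change history of a time zone as text
    ├── lib
    │   └── libtz.a      # static library for parsing time zone files
    ├── sbin
    │   └── zic          # time zone compiler; compiles the text definitions into binary time zone files
    └── share
        ├── man
        ├── zoneinfo     # compiled time zone files
        ├── zoneinfo-leaps
        └── zoneinfo-posix -> zoneinfo
# The IANA time zone database is used by the GNU C Library (used in GNU/Linux), so the installed tree follows the Linux system directory layout.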
zic.c is the source of zic, the tool that compiles the raw text time zone definitions into binary time zone files. A software product that wants the latest time zone data from IANA must fetch both the source code and the data, build the zic compiler, and use it to generate time zone files from the data. The code that parses the compiled time zone files is in localtime.c.
2.2 Rules and maintenance methods of time zone raw data
The raw files of the time zone database are text files organized according to a fixed set of rules. The format can record the complete history of time zone and daylight saving time changes, so software using this database can convert historical timestamps automatically, which is one reason it is so widely used. As a sample, here is the definition of Moscow's time zone:
# Zone  NAME           STDOFF   RULES   FORMAT   [UNTIL]
Zone    Europe/Moscow  2:30:17  -       LMT      1880
                       2:30:17  -       MMT      1916 Jul  3 # Moscow Mean Time
                       2:31:19  Russia  %s       1919 Jul  1 0:00u
                       3:00     Russia  %s       1921 Oct
                       3:00     Russia  MSK/MSD  1922 Oct
                       2:00     -       EET      1930 Jun 21
                       3:00     Russia  MSK/MSD  1991 Mar 31 2:00s
                       2:00     Russia  EE%sT    1992 Jan 19 2:00s
                       3:00     Russia  MSK/MSD  2011 Mar 27 2:00s
                       4:00     -       MSK      2014 Oct 26 2:00s
                       3:00     -       MSK
The last three lines of the Moscow definition read as follows:
- From 2:00 standard time on January 19, 1992 until 2:00 on March 27, 2011, the standard offset is UTC+3 (MSK, with MSD in summer under the Russia rules).
- From 2:00 on March 27, 2011 until 2:00 on October 26, 2014, UTC+4 is used.
- From 2:00 on October 26, 2014 onward, UTC+3 is used.
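To make the column layout of a Zone block concrete, here is a minimal Python sketch that splits the Moscow definition into periods. It is not the zic parser: the function names parse_zone and offset_to_seconds are invented for this example, and it assumes only the simple NAME / STDOFF / RULES / FORMAT / [UNTIL] layout shown above.

```python
def offset_to_seconds(text):
    """Convert a STDOFF value such as '2:30:17' or '-5:00' to seconds."""
    sign = -1 if text.startswith("-") else 1
    fields = [int(f) for f in text.lstrip("-").split(":")]
    fields += [0] * (3 - len(fields))        # pad missing minutes/seconds
    hours, minutes, seconds = fields
    return sign * (hours * 3600 + minutes * 60 + seconds)


def parse_zone(lines):
    """Parse one Zone block into (name, list of period dicts)."""
    name, periods = None, []
    for raw in lines:
        tokens = raw.split("#", 1)[0].split()  # drop comments, tokenize
        if not tokens:
            continue
        if tokens[0] == "Zone":                # first line carries the name
            name, tokens = tokens[1], tokens[2:]
        stdoff, rules, fmt, until = tokens[0], tokens[1], tokens[2], tokens[3:]
        periods.append({
            "stdoff": offset_to_seconds(stdoff),
            "rules": None if rules == "-" else rules,
            "format": fmt,
            "until": " ".join(until) or None,  # None: still in effect
        })
    return name, periods


MOSCOW = """\
Zone Europe/Moscow 2:30:17 - LMT 1880
    2:30:17 - MMT 1916 Jul 3 # Moscow Mean Time
    2:31:19 Russia %s 1919 Jul 1 0:00u
    3:00 Russia %s 1921 Oct
    3:00 Russia MSK/MSD 1922 Oct
    2:00 - EET 1930 Jun 21
    3:00 Russia MSK/MSD 1991 Mar 31 2:00s
    2:00 Russia EE%sT 1992 Jan 19 2:00s
    3:00 Russia MSK/MSD 2011 Mar 27 2:00s
    4:00 - MSK 2014 Oct 26 2:00s
    3:00 - MSK
"""

name, periods = parse_zone(MOSCOW.splitlines())
for p in periods[-3:]:
    print(name, p["stdoff"] // 3600, p["rules"], p["format"], p["until"])
```

Each period starts where the previous line's UNTIL ends, which is how the three bullets above were read off the last three lines.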
For a guide to reading the raw files of the time zone database, see https://data.iana.org/time-zones/tz-how-to.html
The Zoneinfo plug-in for VSCode provides syntax highlighting for the raw time zone data files.
The basic function of the time zone database is to provide the raw time zone data files together with zic, the compiler that converts them into time zone files.
To learn how to use the time zone compiler (which compiles the text definitions into binary time zone data files), see man zic.
3. Universal time zone database usage
The concrete definition of each time zone name is stored in a time zone file, for example /etc/localtime.
The time zone files supported by the operating system are stored in the /usr/share/zoneinfo/ directory.
The time zone files supported by our database are stored in the share/timezone directory.
Time zone files follow a uniform format (defined by the time zone database maintained by IANA), which is documented in info tzfile. Every time zone file begins with this fixed header structure:
struct tzhead
{
    char tzh_magic[4]; /* TZ_MAGIC: the fixed signature "TZif" at the start identifies a time zone file */
    char tzh_version[1]; /* '\0' or '2' or '3' as of 2013: format version */
    char tzh_reserved[15]; /* reserved; must be zero */
    char tzh_ttisutcnt[4]; /* coded number of trans. time flags: number of UTC/local indicators stored in the file */
    char tzh_ttisstdcnt[4]; /* coded number of trans. time flags: number of standard/wall indicators stored in the file */
    char tzh_leapcnt[4]; /* coded number of leap seconds: number of leap second records stored in the file */
    char tzh_timecnt[4]; /* coded number of transition times: number of transition times stored in the file */
    char tzh_typecnt[4]; /* coded number of local time types: number of local time types stored in the file (must not be zero) */
    char tzh_charcnt[4]; /* coded number of abbr. chars: number of time zone abbreviation characters stored in the file */
};
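As an illustration of this layout, the header can be decoded with Python's struct module. This is a sketch for reading along, not the code in localtime.c; the function name parse_tzhead is made up for this example, and the sample bytes are synthetic rather than taken from a real zone file.

```python
import struct

# 4-byte magic, 1-byte version, 15 reserved bytes (skipped), then six
# big-endian 32-bit counts in the order of the struct tzhead fields.
HEADER = struct.Struct(">4sc15x6L")


def parse_tzhead(data: bytes) -> dict:
    """Decode the fixed 44-byte header at the start of a TZif file."""
    if len(data) < HEADER.size:
        raise ValueError("data is shorter than a TZif header")
    (magic, version, ttisutcnt, ttisstdcnt,
     leapcnt, timecnt, typecnt, charcnt) = HEADER.unpack_from(data)
    if magic != b"TZif":
        raise ValueError("missing TZif magic")
    return {
        "version": version,
        "ttisutcnt": ttisutcnt,
        "ttisstdcnt": ttisstdcnt,
        "leapcnt": leapcnt,
        "timecnt": timecnt,
        "typecnt": typecnt,
        "charcnt": charcnt,
    }


# Synthetic header: version '2', two transitions, two local time types,
# eight abbreviation characters, no leap seconds or indicators.
sample = HEADER.pack(b"TZif", b"2", 0, 0, 0, 2, 2, 8)
print(HEADER.size)           # 44
print(parse_tzhead(sample))
```

To try it on a real file, read the first 44 bytes of a file under /usr/share/zoneinfo/ in binary mode and pass them to parse_tzhead.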
The header is followed by tzh_timecnt four-byte values of type long in "standard" (big-endian) byte order, sorted in ascending order. Each value is a transition time (as returned by time(2)) at which the rules for computing local time change.
Next come tzh_timecnt one-byte values of type unsigned char. Each value indicates which of the "local time" types described in the file applies from the transition time with the same index. These values serve as indices into an array of ttinfo structures.
The ttinfo array appears next in the file; each entry is described by the following structure:
struct ttinfo {
long tt_gmtoff;
int tt_isdst;
unsigned int tt_abbrind;
};
In the file, each structure is written as a four-byte value for tt_gmtoff in "standard" byte order, followed by a one-byte value for tt_isdst and a one-byte value for tt_abbrind. In each structure, tt_gmtoff gives the number of seconds to be added to UTC, tt_isdst tells whether tm_isdst should be set by localtime(3), and tt_abbrind serves as an index into the array of time zone abbreviation characters that follows the ttinfo structures in the file.
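Note that the on-disk record is six bytes (4 + 1 + 1), not the size of the C structure. The following Python sketch decodes such records and resolves tt_abbrind against the abbreviation characters. The function name decode_ttinfos is an assumption for this example, and the sample bytes are synthetic, using the CST/CDT offsets that appear in the zdump output later in this article.

```python
import struct

# On-disk ttinfo record: 4-byte signed tt_gmtoff (big-endian),
# 1-byte tt_isdst, 1-byte tt_abbrind.
TTINFO = struct.Struct(">lBB")


def decode_ttinfos(blob: bytes, typecnt: int, abbrevs: bytes):
    """Return a list of (gmtoff, isdst, abbreviation) tuples."""
    result = []
    for i in range(typecnt):
        gmtoff, isdst, abbrind = TTINFO.unpack_from(blob, i * TTINFO.size)
        # Abbreviations are NUL-terminated strings packed back to back;
        # tt_abbrind is the byte offset where this one starts.
        end = abbrevs.index(b"\0", abbrind)
        result.append((gmtoff, bool(isdst), abbrevs[abbrind:end].decode("ascii")))
    return result


abbrevs = b"CST\0CDT\0"
blob = TTINFO.pack(28800, 0, 0) + TTINFO.pack(32400, 1, 4)
print(decode_ttinfos(blob, 2, abbrevs))
# [(28800, False, 'CST'), (32400, True, 'CDT')]
```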
Next come tzh_leapcnt pairs of four-byte values in standard byte order. The first value of each pair gives the time (as returned by time(2)) at which a leap second occurs; the second gives the total number of leap seconds to be applied after that time. The pairs are sorted in ascending order by time.
Then come tzh_ttisstdcnt standard/wall indicators, each stored as a one-byte value. They tell whether the transition times associated with the local time types were specified as standard time or wall clock time, and are used when a time zone file is used in handling POSIX-style time zone environment variables.
Finally, there are tzh_ttisutcnt UTC/local indicators (named tzh_ttisgmtcnt in older versions of the format), each stored as a one-byte value. They tell whether the transition times associated with the local time types were specified as UTC or local time, and are likewise used when a time zone file is used in handling POSIX-style time zone environment variables.
If tzh_timecnt is zero, or the time argument is earlier than the first transition time recorded in the file, localtime uses the first standard-time ttinfo structure in the file, or simply the first ttinfo structure if there is no standard-time structure.
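The selection rule just described can be sketched as a small lookup over already-decoded arrays. This is a simplified illustration, not the logic of localtime.c, which handles more cases. The function name find_ttinfo and the tuple layout (gmtoff, isdst, abbreviation) are assumptions for this example; the two transitions are the 1986 ones from the zdump output shown later.

```python
from bisect import bisect_right
from calendar import timegm


def find_ttinfo(t, transitions, type_indices, ttinfos):
    """Pick the (gmtoff, isdst, abbr) entry that applies at UTC time t."""
    if not transitions or t < transitions[0]:
        # Before the first transition: first standard-time entry,
        # or the first entry if none is standard time.
        for info in ttinfos:
            if not info[1]:
                return info
        return ttinfos[0]
    i = bisect_right(transitions, t) - 1   # last transition at or before t
    return ttinfos[type_indices[i]]


ttinfos = [(28800, False, "CST"), (32400, True, "CDT")]
transitions = [
    timegm((1986, 5, 3, 16, 0, 0)),    # switch to CDT
    timegm((1986, 9, 13, 15, 0, 0)),   # switch back to CST
]
type_indices = [1, 0]

for when in [(1986, 1, 1, 0, 0, 0), (1986, 7, 1, 0, 0, 0), (1986, 12, 1, 0, 0, 0)]:
    print(when[:3], find_ttinfo(timegm(when), transitions, type_indices, ttinfos))
```

Because the transition times are sorted in ascending order, a binary search is enough to find the applicable entry.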
To view the history of changes for a time zone, use zdump to dump the binary time zone data file (see man zdump for details):
./zdump -V Asia/Chongqing | more
Asia/Chongqing Sat Dec 31 16:53:39 1927 UT = Sat Dec 31 23:59:59 1927 LMT isdst=0 gmtoff=25580
Asia/Chongqing Sat Dec 31 16:53:40 1927 UT = Sat Dec 31 23:53:40 1927 LONT isdst=0 gmtoff=25200
Asia/Chongqing Wed Apr 30 16:59:59 1980 UT = Wed Apr 30 23:59:59 1980 LONT isdst=0 gmtoff=25200
Asia/Chongqing Wed Apr 30 17:00:00 1980 UT = Thu May 1 01:00:00 1980 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat May 3 15:59:59 1986 UT = Sat May 3 23:59:59 1986 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat May 3 16:00:00 1986 UT = Sun May 4 01:00:00 1986 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 13 14:59:59 1986 UT = Sat Sep 13 23:59:59 1986 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 13 15:00:00 1986 UT = Sat Sep 13 23:00:00 1986 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 11 15:59:59 1987 UT = Sat Apr 11 23:59:59 1987 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 11 16:00:00 1987 UT = Sun Apr 12 01:00:00 1987 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 12 14:59:59 1987 UT = Sat Sep 12 23:59:59 1987 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 12 15:00:00 1987 UT = Sat Sep 12 23:00:00 1987 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 9 15:59:59 1988 UT = Sat Apr 9 23:59:59 1988 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 9 16:00:00 1988 UT = Sun Apr 10 01:00:00 1988 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 10 14:59:59 1988 UT = Sat Sep 10 23:59:59 1988 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 10 15:00:00 1988 UT = Sat Sep 10 23:00:00 1988 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 15 15:59:59 1989 UT = Sat Apr 15 23:59:59 1989 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 15 16:00:00 1989 UT = Sun Apr 16 01:00:00 1989 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 16 14:59:59 1989 UT = Sat Sep 16 23:59:59 1989 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 16 15:00:00 1989 UT = Sat Sep 16 23:00:00 1989 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 14 15:59:59 1990 UT = Sat Apr 14 23:59:59 1990 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 14 16:00:00 1990 UT = Sun Apr 15 01:00:00 1990 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 15 14:59:59 1990 UT = Sat Sep 15 23:59:59 1990 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 15 15:00:00 1990 UT = Sat Sep 15 23:00:00 1990 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 13 15:59:59 1991 UT = Sat Apr 13 23:59:59 1991 CST isdst=0 gmtoff=28800
Asia/Chongqing Sat Apr 13 16:00:00 1991 UT = Sun Apr 14 01:00:00 1991 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 14 14:59:59 1991 UT = Sat Sep 14 23:59:59 1991 CDT isdst=1 gmtoff=32400
Asia/Chongqing Sat Sep 14 15:00:00 1991 UT = Sat Sep 14 23:00:00 1991 CST isdst=0 gmtoff=28800
China abolished daylight saving time after 1991, so there are no further transitions after that year.
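Each zdump line pairs a UT instant with the matching local time, and the two should differ by exactly gmtoff seconds. The Python sketch below parses a line and checks that relation. The regular expression and the function name parse_zdump_line are assumptions based only on the output format shown above, and the date parsing assumes English day and month names.

```python
import re
from datetime import datetime, timedelta

LINE = re.compile(
    r"^(?P<zone>\S+) (?P<ut>.+?) UT = (?P<local>.+?) (?P<abbr>\S+) "
    r"isdst=(?P<isdst>[01]) gmtoff=(?P<gmtoff>-?\d+)$"
)
FMT = "%a %b %d %H:%M:%S %Y"


def parse_zdump_line(line: str) -> dict:
    """Parse one 'zdump -V' line and check UT + gmtoff == local time."""
    m = LINE.match(" ".join(line.split()))   # normalize runs of spaces
    if m is None:
        raise ValueError("unrecognized zdump line")
    ut = datetime.strptime(m["ut"], FMT)
    local = datetime.strptime(m["local"], FMT)
    gmtoff = int(m["gmtoff"])
    return {
        "zone": m["zone"],
        "ut": ut,
        "local": local,
        "abbr": m["abbr"],
        "isdst": m["isdst"] == "1",
        "gmtoff": gmtoff,
        "consistent": local - ut == timedelta(seconds=gmtoff),
    }


sample = [
    "Asia/Chongqing Sat May 3 15:59:59 1986 UT = Sat May 3 23:59:59 1986 CST isdst=0 gmtoff=28800",
    "Asia/Chongqing Sat May 3 16:00:00 1986 UT = Sun May 4 01:00:00 1986 CDT isdst=1 gmtoff=32400",
]
for row in map(parse_zdump_line, sample):
    print(row["abbr"], row["gmtoff"], row["consistent"])
```

The two sample lines straddle one transition: the second before it is still CST at UTC+8, and the instant itself is already CDT at UTC+9.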
4. How to use time zone data in GaussDB (DWS)
For the convenience of users in China, GaussDB (DWS) defines an Asia/Beijing time zone internally, following the syntax rules defined by IANA; its definition is consistent with that of the PRC time zone. The file is located at timezone/Asia/Beijing under the installation directory.
View the specific definition of Asia/Beijing:
./zdump -V Asia/Beijing
Asia/Beijing Sat May 3 15:59:59 1986 UT = Sat May 3 23:59:59 1986 CST isdst=0 gmtoff=28800
Asia/Beijing Sat May 3 16:00:00 1986 UT = Sun May 4 01:00:00 1986 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 13 14:59:59 1986 UT = Sat Sep 13 23:59:59 1986 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 13 15:00:00 1986 UT = Sat Sep 13 23:00:00 1986 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 11 15:59:59 1987 UT = Sat Apr 11 23:59:59 1987 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 11 16:00:00 1987 UT = Sun Apr 12 01:00:00 1987 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 12 14:59:59 1987 UT = Sat Sep 12 23:59:59 1987 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 12 15:00:00 1987 UT = Sat Sep 12 23:00:00 1987 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 9 15:59:59 1988 UT = Sat Apr 9 23:59:59 1988 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 9 16:00:00 1988 UT = Sun Apr 10 01:00:00 1988 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 10 14:59:59 1988 UT = Sat Sep 10 23:59:59 1988 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 10 15:00:00 1988 UT = Sat Sep 10 23:00:00 1988 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 15 15:59:59 1989 UT = Sat Apr 15 23:59:59 1989 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 15 16:00:00 1989 UT = Sun Apr 16 01:00:00 1989 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 16 14:59:59 1989 UT = Sat Sep 16 23:59:59 1989 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 16 15:00:00 1989 UT = Sat Sep 16 23:00:00 1989 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 14 15:59:59 1990 UT = Sat Apr 14 23:59:59 1990 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 14 16:00:00 1990 UT = Sun Apr 15 01:00:00 1990 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 15 14:59:59 1990 UT = Sat Sep 15 23:59:59 1990 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 15 15:00:00 1990 UT = Sat Sep 15 23:00:00 1990 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 13 15:59:59 1991 UT = Sat Apr 13 23:59:59 1991 CST isdst=0 gmtoff=28800
Asia/Beijing Sat Apr 13 16:00:00 1991 UT = Sun Apr 14 01:00:00 1991 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 14 14:59:59 1991 UT = Sat Sep 14 23:59:59 1991 CDT isdst=1 gmtoff=32400
Asia/Beijing Sat Sep 14 15:00:00 1991 UT = Sat Sep 14 23:00:00 1991 CST isdst=0 gmtoff=28800
That covers what you need to know about time zones in GaussDB (DWS). I hope that studying them from the perspective of principles and maintenance clears up any doubts about time zone usage. If you have related questions, please ask in the forum.
5. Summary
In summary, GaussDB (DWS), as a high-performance analytical database product for global users, supports time zone data in compliance with industry standards.
For more information about GaussDB (DWS), search for "GaussDB DWS" on WeChat and follow the official account, which shares the latest PB-level data warehouse technology.