Mysterious TIME situation in MySQL

Approx. transl. : This detailed analysis of one seemingly insignificant detail in the implementation inside MySQL caused natural discussions about the correctness in approaches to the development of a well-known Open Source project in general. What the Portuguese engineer actually found out, he tells in a format close to a detective story ...



Many in 2020 fell victim to a strange phenomenon of time perception, but some database management systems manipulate time for much longer. I first noticed this when a friend of mine in one of his projects ( Accord is a popular Discord bot) ran into the following exception from the MySQL connector when used with EF Core:



MySqlException: Incorrect TIME value: '960:00:00.000000'


Not too MySQL-savvy (as I prefer PostgreSQL for reasons that will soon become apparent), I thought for a second that the number of hours was wrong. It is reasonable to assume that TIME values ​​are limited to 24 hours, or that values ​​spanning multiple days require a different syntax - for example, 40:00:00:00would represent 40 days. But the reality turned out to be much more complicated and confusing.



The next obvious step was to check the MySQL documentation . It read:



MySQL receives and displays TIME values ​​in 'hh: mm: ss' format (or in 'hhh: mm: ss' format for large hourly values).


So far, everything is fine: our problematic TIME value fits well into this format, although the fact that hhthey hhhare specified explicitly raises suspicions (what about clock values ​​exceeding 999?). The next sentence in the documentation partially explains everything, along the way stimulating a bunch of questions like "What the ...?":



TIME values ​​can range from '-838: 59: 59' to '838: 59: 59'.


Well okay ... Some strange range. There must be a good technical reason for this. 839 hours is 34.958 (3) days, and the entire range is exactly 6040798 seconds. The documentation reads as follows:



MySQL recognizes TIME values ​​in several formats, some of which can include fractional seconds up to 6 decimal places (microseconds).


In other words, the entire interval is 6,040,798,000,000 microseconds. Again, some strange number. It is far from a power of two (between 2 42 and 2 43 ), so MySQL appears to be using some unique internal representation format. But before I get into this issue, let me point out how bad this type is.



That's all MySQL has to offer for measuring time intervals, with the entire time span just a little over one month. How big is this "little bit"? As you can see, it is not even a multiple of an integer number of days.



Worse, the most popular MySQL to EF Core provider converts .NET TimeSpanto TIME by default , despite the fact thatTimeSpancan contain intervals of tens of millennia (it uses 64-bit integers, and the allowed precision is 10 -8 s). Compare this to a couple of months in TIME. Other people have encountered



this problem , and the discussion in the corresponding issue contains a reference to the behavior of SQL Server: "This mimics the behavior of SQL Server". I checked - indeed, the SQL Server time type has a range from 00: 00: 00.0000000 to 23: 59: 59.9999999, which is generally much more reasonable than the strange TIME range. But let's get back to MySQL. What is the reason for such an unusual range? In the MySQL device manual



says that in version 5.6.4 the TIME type has changed and there is support for fractions of seconds. Three bytes are used for the whole part. If these three bytes are used entirely to encode seconds, this results in a time span of more than 2,330 hours - much more than the current maximum of 838 hours (although even that is not very useful when converting TimeSpan'a).



This means that the process that encodes the time in MySQL is wasting bits - perhaps for the sake of ease of use (although I'm not sure under what circumstances this is relevant). Maybe this makes sense if the DBMS (and the developers' idea of ​​what users will do with it) is geared towards working with strings, and the developers want to speed up the presentation hh:mm:ss.



So see:



1 — (1 = , 0 = )

1 ( )

10 — (0-838)

6 — (0-59)

6 — (0-59)

— 24 = 3


That explains everything, doesn't it? Well, let's take a closer look. 10 bits for hours ... and the range is from zero to 838. I hasten to remind you that 2 10 = 1024, not 838. The intrigue is gaining momentum ...



Of course, I'm not the first person who asked this question (I have already asked about this on StackOverflow ). Everything seems to be stated in the "accepted" answer there, however, the strange choice of 838 hours is first explained by "backward compatibility with applications that were written quite a long time ago", and only then it is mentioned that this has something to do with compatibility with MySQL 3 - by the way Windows 98 was then considered a novelty, and Linux was not even 10 years old.



In MySQL 3, the TIME type also used 3 bytes, only it did it in a completely different way. One of the bits was also reserved for the sign, but the remaining 23 bits corresponded to integers, obtained as follows: hours × 10,000 + minutes × 100 + seconds. In other words, the two least significant digits were seconds, the next two were minutes, and the remaining two were hours. 2 * 23 is 83888608, which is 838: 86: 08, so the maximum valid time value in this format is 838: 59: 59.



This format is even less convenient than the current one, since it requires multiplication and division for almost any time operation (with the exception of string formatting and parsing - which once again proves that MySQL pays too much attention to string IO and does not really care about the presence of types. which would be handy for internal operations and non-string based protocols).



The MySQL developers have been able to fix this type many times, or at least provide an alternative one that is free of the existing limitation. The TIME type has changed twice since MySQL 3 until today, but each time the strange range has remained the same - perhaps for compatibility reasons.



I am at a loss to imagine a situation where expanding the range of a value for a type could break application compatibility: do types in MySQL have specific overflow behavior? What sane programmer would rely on internal database type constraints to validate anything in their application? If there is such a person, why on earth would he suddenly decide to transfer this ridiculous 838 hours limit into the data model of his application without any changes? To be honest, I don't even want to know the answers to these questions.



Despite a couple of major transformations in the history of MySQL, the TIME type is still awkward and limited. And the highlight of the program here, in my opinion, is the unused bit "reserved for future extensions." I hope that in the long term it will point to the old, legacy TIME, and by then MySQL and / or MariaDB will have a sensible time type such as INTERVAL in PostgreSQL , which has a range of ± 178,000,000 years and a microsecond accuracy.



PS from translator



Read also on our blog:






All Articles