MTS wasn't in use for production when the world reached the Y2K deadline.
I remember hearing that when they were shutting down one of their MTS systems for the last time, the folks at UBC set the TOD clock ahead to sometime in the 21st century. What I don't remember is what the outcome of that experiment was. Can someone from UBC provide the exciting conclusion to the story?
And I think Mike Alexander mentioned recently that he had to reassemble some component or other due to a date/time issue when he was running MTS under Hercules. It might have been the Tape routines. Mike, can you confirm that?
Are there other MTS-related date/time issues that people remember?
Y2K and other date/time issues?
I, too, recall that this "travelled with the Sun", so it hit us (the UK) first, then UMICH (and presumably RPI, etc.) then finally UBC/SFU. (A pity that the MTS community hadn't had that prospective site in Yugoslavia!)
What a coincidence! Just a few days ago at my post-Durham workplace (ECMWF, Berkshire) I was chatting to someone about this, having not mentioned it for years. I, too, remember that incident. This cropped up simultaneously at NCL and DUR. I think the person who figured it out was Mike Ellison at Durham, who, as I recall, had called in sick that morning, but came in to address it and successfully diagnosed it, despite looking very poorly. I think there was similar activity at NCL and some sort of similar difficulty of absence of the key personnel to address this, although I think there was someone (I forget who) who set to work on it at NCL.
Of course, seen in retrospect from today, this makes MTS look like the Windows of its time, not like the UNIX of its time. (Boo! Hiss! I hear you cry.) Why? Because MTS, like the Windows of yesteryear, did its timekeeping via local time, not by setting its clock to UTC and applying a timezone offset. Had MTS been like UNIX, then we would all have been in it together...
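To make the "travelled with the Sun" effect concrete, here is a minimal sketch of when each group of sites would reach local midnight on 16 November 1989, expressed in UTC. The site groupings and the fixed standard-time offsets (GMT, EST, PST) are my own assumptions for illustration; none of this is taken from MTS code.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets for November 1989 (no daylight saving in effect).
# The site-to-zone mapping is an illustrative assumption, not taken from MTS.
SITE_OFFSETS_HOURS = {
    "Durham/Newcastle (GMT)": 0,
    "Michigan/RPI (EST)": -5,
    "UBC/SFU (PST)": -8,
}


def local_midnight_in_utc(year, month, day, offset_hours):
    """Return the UTC instant at which a site's local clock reads 00:00 on the given date."""
    local_zone = timezone(timedelta(hours=offset_hours))
    local_midnight = datetime(year, month, day, tzinfo=local_zone)
    return local_midnight.astimezone(timezone.utc)


def rollover_schedule(year, month, day):
    """Map each site group to the UTC instant when its local date becomes the given date."""
    return {
        site: local_midnight_in_utc(year, month, day, hours)
        for site, hours in SITE_OFFSETS_HOURS.items()
    }


if __name__ == "__main__":
    for site, instant in rollover_schedule(1989, 11, 16).items():
        print(f"{site:24s} {instant:%Y-%m-%d %H:%M} UTC")
```

Under these assumptions the UK sites reach the new date at 00:00 UTC, the Eastern sites five hours later and the Pacific sites eight hours later. A system that kept its clock in UTC would have reached the date at the same instant everywhere.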
Dave Mills worked at the UM Computing Center in the late 1960s and early 1970s where he developed the PDP-8 based Data Concentrator and what was almost certainly the first non-IBM implementation of a S/360 control unit to I/O channel interface. After Dave left UM he went on to do many important things related to satellite and data communication and what would become today's Internet. Among his work is the design of the Network Time Protocol (NTP). NTP is used by pretty much every computer in the world that is connected to the Internet. In 2008 Dave was elected to the National Academy of Engineering for "contributions to Internet timekeeping and the development of the Network Time Protocol".
There is also a bio sketch for Dave in the People section of this web site.
A Wikipedia article gives more information about Dave.
Starting with the second NTP RFC the RFCs include a fascinating collection of information on time and dates that goes well beyond what is strictly necessary for the implementation of an Internet protocol. The information has been edited and reworked in each of the three NTP RFCs:
RFC 958 - Network Time Protocol (NTP), September 1985
Does not include separate sections with timescale and chronometry information
RFC 1059 - Network Time Protocol (Version 1), July 1988
See section "2.3 Time Scales"
RFC 1119 - Network Time Protocol (Version 2), September 1989
See sections "2.3. The NTP Timescale", "2.4. The NTP Calendar", and "2.5. Time and Frequency Dissemination"
RFC 1305 - Network Time Protocol (Version 3), March 1992
See "Appendix E. The NTP Timescale and its Chronometry"
And Dave has written a book:
Computer Network Time Synchronization: the Network Time Protocol, CRC Press 2006, 304 pp.
From Dave's description of his book:
Chapter 13 describes how we reckon the time according to the stars and atoms. It explains the relationships between the international timescales TAI, UTC and JDN dear to physicists and navigators and the NTP timescale. If we use NTP for historic and future dating, there are issues of rollover and precision. Even the calendar gets in the act, as the astronomers have their ways and the historians theirs. Since the topic of history comes up, Chapter 15 reveals the events of historic interest since computer network timekeeping started over two decades ago.
I received this e-mail from Doug Wade at UBC:
From: Doug Wade <@UBC>
Date: September 20, 2010 2:35:17 AM EDT
To: Jeff Ogden
Subject: MTS - I need help
Your MTS contributions to both Wikipedia and the MTS archive are really cool. I've worked at UBC as a "computer operator" since 1981. Except for Dennis O'Reilly that makes both me and Dean Main the dinosaurs of UBC IT (Computing Centre -> Computing Services -> IT Services -> UBC IT). I'd love to help you out with the MTS stuff.
. . .
4) You asked about the final IPL of MTS at UBC. I was there. We put it into the future (Y2K + a few years). My memory is nothing in the system broke and we were quite surprised. I had asked about doing that previous to the final shutdown and was told "don't you dare" because it might break things to the point that the system would never come up again if needed.
. . .
Keep in touch and any way I can help is no problem. BTW, I would die and go to heaven if I ever managed to get a version of MTS running on my iMac!
I found the following on the "Anecdotes" page of Josh Simon's Web site and the change log looks pretty real. Unlike some of the other materials about MTS at this site these items are not from the 13 May 1996 issue of UM's IT Digest (the "goodbye to MTS issue").
In November of 1989, a minor itsy-bitsy bug was discovered in the MTS code. Nothing you'd call major, really. Seems that the United Kingdom-based MTS sites were having all sorts of file system-related problems. Luckily for us in the United States, we had 5 hours before it became midnight locally. The problem was that some of the file system code used an unsigned half-word integer (16 bits) to store the number of days since zero time (March 1, 1900). Unfortunately, the rest of the file system code used a signed half-word integer (15 bits data, 1 bit sign) — and when it became the 32,768th day after zero time, the sign bit flipped and parts of the system thought files were stamped as being created or modified 32,767 days in the future. MTS didn't like this concept, so it caused all sorts of system problems. (The change log comments are available.)
The systems programmers hurriedly patched the file system code to use unsigned half-word integers consistently, recompiled the operating system, and provided patches to the various MTS Consortium sites. (Hewlett-Packard was using a previous version of MTS — Distribution 5.1 instead of the then-current Distribution 6.0 — at one of their sites. We provided them with a binary-only version of the patch and informed them not to trust any previous backups of the operating system.)
Of course, as the senior programmer noted on the systems programmers' mailing list, this solution will only work until the 65535th day after zero time (which maps out to some time in 2061). His comment was that if anyone was still running what would in effect be a century-old operating system then that they got what they deserved. And besides, by 2061, all of the then-current systems programmers would be retired or deceased, so they really didn't much care. (Shades of the Year 2000 problem, huh?)
And the change log:
More items from Risks.
From the Risks Digest, Volume 17, Number 19, 19 June 1995:
Thu, 15 Jun 1995 19:45:13 -0400
I, too, am glad to see that Multics is still used. It is a system that was far ahead of its time in many respects.
In MTS (the Michigan Terminal System), a system contemporaneous with Multics which is also still in use, we solved the problem of the operators entering a bad time in a slightly different way. During initialization, the system compares the current time with the time in the last billing record recorded. If the current time is earlier or too much later (more than 12 hours, unless the day is Sunday in which case 18 hours is ok) it complains and asks the operators to confirm that the time is ok. This has several advantages: it doesn't use hard-coded dates, it is a more precise check, and it never makes the system unusable. Of course this has become less important as modern machines maintain the time of day even when not running and the clock rarely needs to be set at all.
Mike Alexander, Univ. of Michigan
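The check Mike describes can be sketched roughly as follows. This is only my reading of his note, not the MTS initialization code: the function and constant names are hypothetical, and I am assuming that "the day is Sunday" refers to the day of the time being entered.

```python
from datetime import datetime, timedelta

# The 12-hour and 18-hour thresholds come from the note above.
# Names and structure are hypothetical, not taken from MTS.
NORMAL_MAX_GAP = timedelta(hours=12)
SUNDAY_MAX_GAP = timedelta(hours=18)


def needs_operator_confirmation(current, last_billing_record):
    """Return True if the entered time looks implausible and should be confirmed."""
    # A clock earlier than the last billing record is always suspicious.
    if current < last_billing_record:
        return True
    # Assumption: the Sunday allowance depends on the day of the entered time.
    # In Python, Monday is 0 and Sunday is 6.
    limit = SUNDAY_MAX_GAP if current.weekday() == 6 else NORMAL_MAX_GAP
    return current - last_billing_record > limit


if __name__ == "__main__":
    last = datetime(1995, 6, 15, 19, 45)
    print(needs_operator_confirmation(last + timedelta(hours=2), last))
    print(needs_operator_confirmation(last - timedelta(minutes=1), last))
```

Because the reference point is the last billing record rather than a date compiled into the code, the check keeps working as the years go by, and a failed check only asks a question instead of stopping the system.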
And an earlier note to Risks, Volume 17, Number 18, 15 June 1995 that Mike was probably responding to:
8 Jun 1995 22:45:59 GMT
[A woodka tonic forwarded to RISKS by Donna Woodka, who probably knows my penchant (or even pun-chant) for Multics tales. Thanks to Bernard for having fortifived our archives and providing evidence that Multics still lives! PGN]
Ward Anderson at ACTC just reported an interesting crash on Multics (10.2) at ACTC -- Collection 1 initialization discovered that I became 45 years old Tuesday past, an event which was extremely unlikely, and crashed the system before the clock did damage to the file system, or so it feared.
The code in scs_and_clock_init is perfectly clear - the time "06/06/95 18:31 est Tuesday" is hard-coded in, in characters, with the comment that it is "Bernard S. Greenberg's 45th birthday". It has been there for twenty years in plain text visible to anyone reading the code! (I loved to read code in my day, especially initialization - perhaps I was the last?)
Maybe Tom Van Vleck remembers, but it is extremely likely that twenty years ago at CISL our operator at the time for the nth and last time forgot to set the clock, or set it poorly, and damaged the file system (which looks quite askance on "back to the future" jaunts), and Tom and I said "This has to end. We have to put a gullibility check in the clock init code", and I did this. Probably saved a lot of file system damage over the years. If I had it to do over again, I'd do it over again! This code did the -right-thing-!
At 25, I could not imagine I'd ever be 45, let alone that scs_and_clock_init.pl1 would be there along with me! Somehow, though, 65 doesn't seem that far away any more...
As Ward said, this is a -real- Multics story.
Bernie
I found the following. I think it is Brian Randell's initial post to Risks about this event. It is still hard to figure out what, if anything, crashed and if the crash or other problem actually occurred on both sides of the Atlantic or if the warning from the UK came soon enough to save the rest of us some embarrassment. The post talks about an "unexpected system shutdown" and a "bug", but doesn't use the word "crash".
From the Risks Digest, Volume 9, Number 45, 20 November 1989:
Fri, 17 Nov 89 9:17:33 BST
We apologise for the unexpected system shutdown today (Thursday). This was caused by a bug in the MTS system that was a "time-bomb" in all senses of the word. It was triggered by today's date, 16th November 1989. This date is specially significant. Dates within the file system are stored as half-word (16 bit) values which are the number of days since the 1st March 1900. The value of today's date is 32,768 decimal (X'8000' hexadecimal). This number is exactly 1 more than the largest positive integer that can be stored in a half-word (the left-most bit is the sign bit). As a result, various range checks that are performed on these dates began to fail when the date reached this value. The problem has a particular interest because all the MTS sites world-wide are similarly affected. Durham and Newcastle were the first to experience the bug because of time zone differences and we were the first to fix it. The American and Canadian MTS installations are some 4 to 8 hours behind us so the opportunity to be the first MTS site to fix such a serious problem has been some consolation. The work was done by our MTS specialist who struggled in from his sick bed to have just that satisfaction!
Does anyone remember who the "MTS specialist" was?
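The arithmetic in the quoted report can be checked with a short sketch. One assumption is needed to make the numbers line up: 1 March 1900 has to be counted as day 1 (so day 0 is 28 February 1900), since that is what makes 16 November 1989 come out as 32,768. The helper names and the sample range check are hypothetical and are not MTS code.

```python
from datetime import date, timedelta

# Assumption: 1 March 1900 is day 1, so day 0 is the day before.
# 1900 was not a leap year, so the day before is 28 February 1900.
DAY_ZERO = date(1900, 2, 28)


def day_number(d):
    """Days since DAY_ZERO, so 1 March 1900 is day 1."""
    return (d - DAY_ZERO).days


def date_of_day(n):
    """Calendar date for a given day number."""
    return DAY_ZERO + timedelta(days=n)


def as_signed_halfword(n):
    """Reinterpret the low 16 bits of n as a two's-complement signed value."""
    n &= 0xFFFF
    return n - 0x10000 if n & 0x8000 else n


def as_unsigned_halfword(n):
    """Reinterpret the low 16 bits of n as an unsigned value."""
    return n & 0xFFFF


def passes_signed_range_check(n):
    """Hypothetical range check: a stored day number must be positive when read as signed."""
    return as_signed_halfword(n) > 0


if __name__ == "__main__":
    today = day_number(date(1989, 11, 16))
    print(today, hex(today))
    print(as_signed_halfword(today), passes_signed_range_check(today))
    print(as_unsigned_halfword(today))
    print(date_of_day(0xFFFF).year)
```

Read as a signed halfword, X'8000' is -32,768, so any check that expects a positive day number starts failing on that date, while the unsigned reading stays at 32,768. Under the same day-1 assumption, the largest unsigned halfword value, 65,535, corresponds to a date in 2079.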
And a little more from the next Risks digest, Volume 9, Number 46, 22 November 1989. PGN is the Risks moderator Peter G. Neumann.
Another Foretaste of the Millenium? (RISKS-9.45, corrigenda)
Brian Randell <Brian.Randell@newcastle.ac.uk>
Tue, 21 Nov 89 10:12:20 BST
[Brian sent me two versions of the MTS saga, part of one of which ran in RISKS-9.45 -- but without the explanation indicating that the MTS message was not from Brian but rather from someone else. The surrounding text is given below, in case anyone thought that the "We apologise ..." message was originally Brian's. I apologize to Brian in case anyone was misled. PGN]
The university computing service here runs MTS (the Michigan Terminal System) on an Amdahl mainframe, which crashed mysteriously today, as did various other MTS sites in North America, some time later. The explanation is given in the following message which I have just received from one of the systems programmers here.
> We apologise for the unexpected system shutdown ... [see RISKS.9-45 for text.]
I hadn't realised that there was this disadvantage to living on this side of the Atlantic! Ah, well, it makes up for various advantages :-)
Brian Randell
This note does use the word "crash" and says that there were problems on both sides of the Atlantic.
It wasn't a Y2K issue exactly, but one MTS date and time issue that is widely mentioned on the Web had to do with a halfword integer overflow first encountered by the folks at NUMAC.
From Computer-Related Risks: Excerpts on Computer Calendar-Clock Problems, Peter G. Neumann, Computer Science Laboratory, SRI International:
Overflows. The number 32,768 = 2^15 has caused all sorts of grief that resulted from the overflow of a 16-bit word. ... Brian Randell reported that the University of Newcastle upon Tyne, England, had a Michigan Terminal System (MTS) that crashed on 1989 Nov 16, 2^15 days after 1900 Mar 01. Five hours later, MTS installations on the U.S. east coast died, and so on across the country, an example of a genuine (but unintentional) distributed time bomb.
I'd left the UM Computing Center for Arbortext when this took place, but I heard about it. The story I remember was a little different. It was that the five to eight hour time difference between Newcastle and sites in North America allowed the Newcastle folks to spread the word and get the North American MTS systems patched in time to avoid the problem. Is my memory any good or is this just wishful thinking on my part?
Tony Young included this ps on a note that he sent me back in August:
I'm sure MTS anecdotes are totally inappropriate to your article, but do you remember the 31-bit MTS date overflow problem - perhaps one of the few advantages of having MTS systems in England who hit the problem some 5(6?) hrs earlier and were able to give early warning?
Anecdotes may have been inappropriate for the Wikipedia article, but they are just the thing for this web site.
And George Helffrich sent me this note yesterday (13Sep2010):
I don't think the system crashed; I think it was *FILESAVE that was unavailable due to 16 bit integer days in the directory of file versions saved. It would be interesting to unearth newsletters to see what actually happened. Viktors Berstis would probably remember; he wrote the original code, though had left for IBM by then.
There are copies of the MTS Newsletters at UM's Bentley Historical Library, so I can check them there. But I'm guessing that by 1989 this would have been recorded in CONFER, *FORUM, or sent via e-mail and not included in the paper newsletter, if we were still publishing the paper version in 1989.
Does anyone know where we can find Brian Randell's initial report? Or does anyone remember the details of this event?