"The Paging Game" (or sometimes "The Thing King") describes an imaginary game that is played by the same rules that govern the virtual memory paging process that many computer systems use to "stretch" the effective sizes of their memories. As the article explains "Understanding the game is easier than understanding the actual paging process, because computers usually are surrounded by an air of confusion and mystery whereas games usually aren't." The article was written between 1972 and 1974 by Jeff Berryman, a systems programmer working on the Michigan Terminal System (MTS) operating system at the University of British Columbia Computing Centre. The article is often mistakenly attributed to MIT's Project MAC.
University computing center newsletters were widely shared in the 1970s, and as a result the article was frequently reprinted and got into general circulation. It was widely distributed through the SHARE IBM user group and was included in at least two books, Charles T. Meadow's Applied Data Management [Wiley, 1976] and Peter van der Linden's Expert C Programming: Deep C Secrets [Prentice Hall, 1994]. A Google search for "Paging Game" "Thing King" in November 2012 returned over 360 results. Jeff Berryman has said that he considers the article to be in the public domain and so, as far as he is concerned, anybody can publish it anywhere they want. He also finds it funny that the only publication that ever brought him any fame in the computing field was completely facetious. A discussion on the MTS Archive web site gives more information about the article, including where and why it was written.
There are three parts to the article. Part I gives the game rules, Part II gives the corresponding rules for the actual process using computing terms instead of the terms of the game, and Part III discusses the object of the game and how one can make the paging mechanism work best for computer programs. Part III, and sometimes Part II, were often omitted when the article was reprinted. While the article describes virtual memory demand paging in a fairly generic way, all three parts are slightly dependent on the details of the specific virtual memory architecture and hardware configuration at a particular site and time, and Part III is slightly dependent on the computer operating system being used. As a result, many different versions of the article were created with minor adjustments to more closely match the details of different vendors' virtual memory architectures, different sites' hardware configurations, and different operating systems.
The version that follows includes all three parts and is taken from the 19 June 1974 issue of the University of Michigan Computing Center Newsletter (Vol. 4, No. 7, pages 2-6), which is available from the HathiTrust Digital Library. Some details are based on the virtual memory architecture of the IBM System/360 Model 67 mainframe computer, on the hardware configuration at the University of Michigan in 1974, and on a few aspects of the Michigan Terminal System (MTS) time-sharing operating system that was in use at the University of British Columbia and at the University of Michigan. The University of Michigan allows portions of its Newsletter to be reprinted without permission so long as the source is clearly acknowledged. To avoid misunderstandings, any abridgments of reprinted material should be indicated by ellipsis marks (. . .) and editorial interpolations should be enclosed in brackets.
A copy of the article is also available on Wikisource.
[Editor's Note: At the request of many users as well as our own staff members, the following article is reprinted from the University of British Columbia Computing Centre Newsletter. It was written by J. Berryman and appeared most recently in its entirety in Vol. 6, Number 2, February 1974. Minor changes have been made in keeping with the minor differences that exist between UBC's Computing Center and our own. [June 1974.]]
PART I

Certain computer systems (MTS, for one) use a process called paging to increase the effective memory capacities of their machines. To help explain this process we have invented a game that is played by the same rules as those that control the paging process. Understanding the game is easier than understanding the actual paging process, because computers usually are surrounded by an air of confusion and mystery whereas games usually aren't. Part I gives the game rules, Part II gives the corresponding rules for the actual process, and Part III discusses the object of the game and how you can make the paging mechanism work best for your programs.
The Crating Game

Rules
LONG LIVE THE THING KING!
PART II

In Part I we said that paging was a process by which some computer systems, including ours, "stretch" the effective sizes of their memories. We described an imaginary game that was played by the same rules that govern the paging process. In this part we'll list the rules again, this time using computing terms instead of the terms of the game. Compare each paging rule given below with the corresponding game rule.
The Paging System

Rules
(Part I game terms are in parentheses, following their real counterparts.)
Notice that without paging, the total number of bytes (things) that a computer can store is equal to the capacity of its real memory (workshop) alone. With paging, the total number of stored bytes (things) is equal to the real memory (workshop) capacity plus the paging devices' (warehouses') capacity plus the disk (dump) capacity. In our case, the approximate figures are:
Paged memory is called virtual memory to distinguish it from ordinary (real) memory.
Virtual memory is MUCH cheaper than equal amounts of real memory. Real memory is at least ten times as expensive as drum and disk memory. Consequently, our virtual memory, which consists mostly of drum and disk storage, costs much less than an ordinary, 100 percent real memory of the same size.
PART III

In Part I we invented a game whose rules paralleled the real rules of memory paging; in Part II we gave the real rules. In this part we'll discuss the implications of those rules for programmers who are using a paging machine.
When writing programs, there is only one rule that need be followed:
LOCALIZE YOUR REFERENCES.

What does this mean? It means that you should write your programs so that they refer to storage in as contiguous and non-scattered a way as possible. Let's take a pair of simple FORTRAN examples. We'll exaggerate the severity of the problem for the sake of explanation, then consider a more realistic example later.
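A sketch of the two programs, reconstructed from the statement-by-statement traces that follow (the exact original listings may have differed slightly in form):

C     PROGRAM A -- RECONSTRUCTED SKETCH (SCATTERED REFERENCES)
      DIMENSION X(2000)
      X(1)=50.
      X(1999)=40.
      X(2)=30.
      X(2000)=20.
      STOP
      END

C     PROGRAM B -- RECONSTRUCTED SKETCH (LOCALIZED REFERENCES)
      DIMENSION X(2000)
      X(1)=50.
      X(2)=30.
      X(1999)=40.
      X(2000)=20.
      STOP
      END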
Both of these programs do the same thing; Program B does it more efficiently. Why? Let's look at the array X in the context of paging. X is 2000 words long; each word takes 4 bytes, so X requires 8000 bytes of storage. Since each page is 4096 bytes of virtual memory, X takes a bit less than 2 pages. Since the elements of arrays like X are arranged in subscript order, X(1) and X(2) will fall in one page, while X(1999) and X(2000) will fall in another. We can draw a little map of X in storage:
[Note: In November 2012 the two storage maps that follow were adjusted slightly from the 1974 version, where the maps did not seem to match the associated text.]
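As an approximate illustration (assuming, purely for simplicity, that X happens to begin exactly on a page boundary, so that each page holds 4096/4 = 1024 elements), the layout would look something like this:

      Page 1:  X(1), X(2), X(3), ..., X(1024)
      Page 2:  X(1025), X(1026), ..., X(1999), X(2000), followed by whatever else is allocated there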
Now, let's see what happens when Program A runs.
Statement / Action

DIMENSION X(2000)
Action: None. DIMENSION is non-executable.

X(1)=50.
Action: X(1) is set to 50. However, the page containing X(1) must be in real storage before X(1) can be referenced. If the page happens to be on the paging device, it must first be read into real storage or "paged-in."

X(1999)=40.
Action: X(1999) is set to 40. However, since X(1999) is not in the same page as X(1), X(1999)'s page may first have to be read from a paging device.

X(2)=30.
Action: X(2) is set to 30. X(2) is in the same page as X(1); unfortunately, since some time has elapsed since X(1) was referenced, X(2)'s (and X(1)'s) page may have been "paged-out" to a paging device again. Remember that we have a timesharing system in which all jobs take turns using the same (or parts of the same) real storage, just as all the players in the crating game use the same workshop. Consequently, the whole paging process is more active than your job alone would have it be. So X(2)'s page might have to be reread from a paging device into real memory.

X(2000)=20.
Action: X(2000) is set to 20. X(2000) is in the same page as X(1999). Unfortunately, just as for X(2)'s and X(1)'s page, X(2000)'s page might have to be paged-in.

STOP
Action: Program terminates.

END
Now we'll trace Program B.
X(1)=50.
Action: X(1) is set to 50. Like Program A, X(1)'s page may have to be read from a paging device.

X(2)=30.
Action: X(2) is set to 30. Unlike Program A, X(2)'s page will almost certainly not have to be reread, since it was referenced only an instant ago.

X(1999)=40.
Action: X(1999) is set to 40. Like Program A, X(1999)'s page may well have to be read from a paging device.

X(2000)=20.
Action: X(2000) is set to 20. Like X(1)'s page, X(2000)'s page probably won't need to be reread.

STOP
Action: Program terminates.

END
Program B is better than Program A because it tends to require a smaller number of page-reads. This is true because Program B localizes its references; it does all its work on one page before moving on to the next, whereas Program A bounces its references back and forth from page to page.
"So what?" someone says. "Page-reads are free, aren't they?" No, we say. Although the system doesn't charge for page-reads per se, each page-read uses a certain amount of CPU time, part of which you are charged for. Also, page-reads increase the elapsed time of a run by making the system (which operates much faster than the paging device) suspend your job until your page can be recalled into main memory from the paging device.
As we said, the example we've used isn't quantitatively accurate; in actual fact, Program A probably would have needed no more page-reads than Program B. The four statements would probably have been executed so quickly that once both pages came into real memory, either program, A or B, would have finished before the vagaries of the paging mechanism found it necessary to put any pages back onto the paging device.
However, for large arrays, these subtleties can grow into expensive disasters. How does this happen? Let's look at a more true-to-life example.
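A sketch of what Programs C and D might have looked like follows. The 1000-by-250 array bounds are inferred from the storage figures discussed below, and the value assigned inside the loops is only a placeholder, so the original listings may have differed in detail; only the order of the subscripting matters here.

C     PROGRAM C (THE WRONG WAY) -- RECONSTRUCTED SKETCH
C     THE RIGHTMOST (COLUMN) SUBSCRIPT VARIES MOST RAPIDLY,
C     SO SUCCESSIVE REFERENCES ARE 1000 ELEMENTS APART.
      DIMENSION X(1000,250)
      DO 10 I=1,1000
      DO 10 J=1,250
   10 X(I,J)=0.
      STOP
      END

C     PROGRAM D (THE RIGHT WAY) -- RECONSTRUCTED SKETCH
C     THE LEFTMOST (ROW) SUBSCRIPT VARIES MOST RAPIDLY,
C     SO SUCCESSIVE REFERENCES ARE ADJACENT IN STORAGE.
      DIMENSION X(1000,250)
      DO 20 J=1,250
      DO 20 I=1,1000
   20 X(I,J)=0.
      STOP
      END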
These two programs were compiled and run by your author. Program D (the right way) took 4.12 seconds of CPU time and required 17 page-reads. Program C was stopped before it had finished; when it was stopped, at about 1/8 toward completion, it had used 0.50 seconds of CPU time and had needed more than 3560 page-reads! The test was made on the 360/67 under conditions of moderately heavy loading. In a very busy time, the difference would have been greater; in a slack time, the difference would have been less. Why? Let's look at how these programs referenced their data.
Multi-dimensioned arrays in FORTRAN are stored internally in column-major order. This means that the array X in Programs C and D would have been stored in the following order:
That is, the array elements for column 1 (for all rows) are stored at the beginning addresses of the array; next, in ascending addresses, are all of the elements of column 2, and so forth. The elements of column 250 occupy the highest addresses in the region of storage allocated to the array X. Note that the total size of this array is 1,000,000 bytes (four bytes for each of 250,000 array elements). This amounts to just about one-half of the total real memory if the entire array were paged-in. It is also worthwhile noting that each column requires 4000 bytes, or nearly a whole page. Consequently, the first elements (i.e., the row-1 elements) of consecutive columns will be found at intervals of 4000 bytes. Thus, the chances are very good that the row-1 element of one column will be located on a different page from the row-1 element of the next column. It should be apparent that this argument is valid for rows 2, 3, and all the remaining rows of the array.
Programs C and D reference the array in the following respective orders:
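Again assuming 1000 rows and 250 columns, the two reference orders would be approximately:

      Program C:  X(1,1), X(1,2), X(1,3), ..., X(1,250), X(2,1), X(2,2), ..., X(2,250), ..., X(1000,1), ..., X(1000,250)
      Program D:  X(1,1), X(2,1), X(3,1), ..., X(1000,1), X(1,2), X(2,2), ..., X(1000,2), ..., X(1,250), ..., X(1000,250)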
Comparing these two orders with the order of the array shows that Program C references the array in a scattered way, skipping 999 array elements between references, while Program D references the array consecutively. Program D obeys the rule. Vive la difference!
The general rule for n-dimensioned FORTRAN array referencing is that the leftmost subscript should vary most rapidly, the next leftmost subscript should vary next most rapidly, etc.
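For example, a loop nest obeying this rule for a hypothetical three-dimensional array (the name and bounds here are chosen only for illustration) would put the leftmost subscript on the innermost loop:

C     THE LEFTMOST SUBSCRIPT I VARIES MOST RAPIDLY (INNERMOST LOOP),
C     SO THE ARRAY IS SWEPT IN ITS COLUMN-MAJOR STORAGE ORDER.
      DIMENSION A(100,50,20)
      DO 30 K=1,20
      DO 30 J=1,50
      DO 30 I=1,100
   30 A(I,J,K)=0.
      STOP
      END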
In PL/1, arrays are stored internally in row-major order—the opposite of FORTRAN—so PL/1 arrays should be referenced with the rightmost subscript varying most rapidly, the next rightmost subscript next most rapidly, etc.
In general, you should try to program according to the rule whenever you can. For many small programs, the difference will be minimal; however, small programs can have large arrays! Programs C and D, after all, are "small" in the sense that they have few instructions. If you have a program that is inexplicably expensive to run, perhaps it is violating the rule too much or too often.
Remember, you can find out the total number of page-reads used by your job—they are printed in the accounting information (the "tailsheet") after each signoff. Moreover, if you issue the command $SET TDR=ON, the CPU time and number of page-reads used will be printed after the execution of each command. Try it! If a program is doing thousands of page-reads, it may well be wasting computing funds.
After all this, you might wonder why we use the paging system if people have to worry about it. The reason is that we have, in MTS, a timesharing computer system in which jobs can have up to about 8,000,000 bytes of storage whenever they need it. As we outlined in Part II, the cost of an equivalent system using non-paged memory would be outrageous. We just couldn't do it. So, in the language of Part I, even that most benevolent of monarchs, the Thing King, is imperfect; to keep ourselves happy, we must take his faults into account.