Phantom Assignments Not Counting Leap

The last day of 2008 was the day the world stood still for Microsoft Zune music players. First-generation Zunes—those with 30-gigabyte disk drives—went silent everywhere on December 31. The cause was soon traced to calendrical code in the device’s firmware. The bug is an interesting one, if only because all the details, including the source code, immediately came to light. Explanations were posted in the Zuneboards forum, at Ars Technica, at ProgramPhases.com and elsewhere. A copy of the C source module (which bears copyright notices from both Microsoft and Freescale Semiconductor) was posted here.

When a high-profile bug like this one comes along, it’s tempting to heap abuse and scorn on the hapless programmers who released the faulty code into the wild. And I do think this is an error that should have been caught in testing or during a code review. On the other hand, I’ve made so many mistakes myself over the years that I’m reluctant to give others a hard time. And I think the main lesson from this incident is not the usual refrain that Microsoft is lame. Rather it’s the even more commonplace observation that software is hard. Subtle traps lie in wait even in what seems a very simple algorithm.

Here’s the code that caused all the trouble:

    year = ORIGINYEAR;
    while (days > 365) {
        if (IsLeapYear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }

The purpose of this routine is to calculate the current year. On entry, the variable days is equal to the number of days since January 1, 1980, which is taken as the beginning of time. On exit, the variable year is supposed to hold the correct year number. (The global constant ORIGINYEAR is set to 1980. The predicate IsLeapYear does just what you would guess.)

The intent of the code is clear enough: We’ll repeatedly decrement the days count—by 365 in common years and by 366 in leap years—and each time we do that, we’ll increment the year number by 1. At the end of the process, the year count should have reached the correct current value. But of course I wouldn’t be writing this if the program always gave the right answer. Can you spot the problem?

Consider what happens when the initial value of days is 366. Since this number is greater than 365, we pass the test at the head of the while loop and enter that block. Since the initial year is 1980, the IsLeapYear predicate returns true, and we enter the consequent clause. Now we try the test days > 366, which is false, and so we do not decrement days or increment year. In fact, we do nothing at all, because there’s no else clause attached to this if statement. We simply return to the head of the loop, where we find that days is still greater than 365, and so we go through the same motions again, and again, ad infinitum. This is just what happened on December 31, 2008, which was day 10,593 in Zune time. The bug would be triggered on the last day of any leap year; it wasn’t observed before now simply because the Zune didn’t yet exist the last time February had 29 days.
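To watch the stall without actually hanging a machine, here is a minimal sketch of the same loop with an iteration cap bolted on. The cap, the -1 return value, and the helper names is_leap_year, last_day_number and zune_year_capped are my own illustrative additions, not anything from the firmware; the day count assumes January 1, 1980 is day 1.

```c
#include <stdbool.h>

#define ORIGINYEAR 1980

/* Gregorian rule; stands in for the firmware's IsLeapYear. */
bool is_leap_year(int year) {
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* One-based day number of December 31 of the given year,
 * counting January 1, 1980 as day 1 (hypothetical helper). */
int last_day_number(int year) {
    int total = 0;
    for (int y = ORIGINYEAR; y <= year; y++) {
        total += is_leap_year(y) ? 366 : 365;
    }
    return total;
}

/* The loop from the article, abandoned after max_passes iterations.
 * Returns the year, or -1 if the loop failed to finish in time. */
int zune_year_capped(int days, int max_passes) {
    int year = ORIGINYEAR;
    int passes = 0;
    while (days > 365) {
        if (++passes > max_passes) {
            return -1; /* days and year have stopped changing */
        }
        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
            /* days == 366 falls through here with nothing modified */
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}
```

With these definitions, last_day_number(2008) comes out to 10593, and zune_year_capped reports -1 for that day number, whereas the day before (10592) yields 2008 as expected.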

A reader at Zuneboards.com quickly suggested a fix: Replace the test days > 366 with days >= 366. As another reader pointed out, this is incorrect. The change eliminates the infinite loop, but the altered routine returns the wrong year on December 31 of every leap year. The reader who spotted this error suggested a different solution, adding an else clause with a break command:
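A quick sketch shows why loosening the inner comparison is not enough. This assumes the proposed change was from > to >= in the inner test, and the names is_leap_year and year_with_ge_test are hypothetical. On day 366 the routine subtracts the whole leap year and moves on to 1981, even though the date is still December 31, 1980.

```c
#include <stdbool.h>

#define ORIGINYEAR 1980

/* Gregorian rule; stands in for the firmware's IsLeapYear. */
bool is_leap_year(int year) {
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* The original loop with the inner test loosened to >=.
 * It always terminates, but it overshoots on day 366 of a leap year. */
int year_with_ge_test(int days) {
    int year = ORIGINYEAR;
    while (days > 365) {
        if (is_leap_year(year)) {
            if (days >= 366) { /* always true inside this loop */
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}
```

Here year_with_ge_test(365) gives 1980, but year_with_ge_test(366) gives 1981, one year too many.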

    year = ORIGINYEAR;
    while (days > 365) {
        if (IsLeapYear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            } else {
                break;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }

Still another reader offered this option:

    year = ORIGINYEAR;
    while (days > 365) {
        if (IsLeapYear(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            } else if (days == 366)
                days -= 366;
        } else {
            days -= 365;
            year += 1;
        }
    }

As far as I can tell from running a few test cases, both of these versions yield correct results, but their logic is anything but perspicuous. Trying to verify correctness by pure reason, as opposed to trial-and-error testing, looks almost hopeless. And then there’s the matter of aesthetics: In my opinion, all three programs are just plain ugly. A task this simple shouldn’t need so many ifs and elses.
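Trial-and-error testing can at least be made exhaustive over the device's stated lifespan. The sketch below walks the calendar forward one day at a time from January 1, 1980 (taken as day 1) through the end of 2080 and counts how often a candidate routine names the wrong year. The function names and the function-pointer harness are my own scaffolding, not part of any posted fix.

```c
#include <stdbool.h>

#define ORIGINYEAR 1980
#define MAXYEAR (ORIGINYEAR + 100)

/* Gregorian rule; stands in for the firmware's IsLeapYear. */
bool is_leap_year(int year) {
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* The reader's fix that adds an else clause with a break. */
int year_with_break(int days) {
    int year = ORIGINYEAR;
    while (days > 365) {
        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            } else {
                break;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}

/* The other reader's fix, which zeroes days on day 366. */
int year_with_else_if(int days) {
    int year = ORIGINYEAR;
    while (days > 365) {
        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            } else if (days == 366) {
                days -= 366;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}

/* Enumerates every day from ORIGINYEAR through MAXYEAR and counts
 * the day numbers for which the candidate returns the wrong year. */
int count_mismatches(int (*candidate)(int)) {
    int mismatches = 0;
    int day_number = 0;
    for (int year = ORIGINYEAR; year <= MAXYEAR; year++) {
        int length = is_leap_year(year) ? 366 : 365;
        for (int d = 0; d < length; d++) {
            day_number += 1; /* January 1, 1980 is day 1 */
            if (candidate(day_number) != year) {
                mismatches += 1;
            }
        }
    }
    return mismatches;
}
```

Both count_mismatches(year_with_break) and count_mismatches(year_with_else_if) should come back as zero. That is still testing rather than proof, but it covers every day the routine is meant to handle.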

Or so I thought. After spending some hours futzing about with the problem, and finding awful bugs in several of my own attempts, I’m not so sure there’s a simple and satisfying solution. Here’s my best effort so far:

    year = ORIGINYEAR - 1;
    while (days > 0) {
        year += 1;
        if (IsLeapYear(year))
            days -= 366;
        else
            days -= 365;
    }

I suppose I consider this an improvement over the alternatives given above, but I don’t really like it. What annoys me most is the trick of initially setting year to ORIGINYEAR - 1. This violates an implicit assumption that year will always be equal to or greater than 1980. In some programming environments, we could even make that assumption into an enforceable assertion, in which case this code would break.

To avoid introducing the phantom year 1979, we could do it this way:

    year = ORIGINYEAR;
    while (days > 0) {
        if (IsLeapYear(year))
            days -= 366;
        else
            days -= 365;
        if (days > 0)
            year += 1;
    }

But, to my taste, repeating the test days > 0 is even more offensive than starting the program before the beginning of time. And both of these routines can leave the variable days with a negative value, which is also nonsensical.
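The negative leftover is easy to see if the two routines are wrapped as functions that also report the final day count. This is only a sketch: the function names and the days_left out-parameter are hypothetical additions for illustration, and is_leap_year stands in for the firmware's predicate.

```c
#include <stdbool.h>

#define ORIGINYEAR 1980

/* Gregorian rule; stands in for the firmware's IsLeapYear. */
bool is_leap_year(int year) {
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* Version that starts from the phantom year 1979. */
int year_from_phantom_1979(int days, int *days_left) {
    int year = ORIGINYEAR - 1;
    while (days > 0) {
        year += 1;
        days -= is_leap_year(year) ? 366 : 365;
    }
    *days_left = days; /* zero or negative on exit */
    return year;
}

/* Version that starts from 1980 but repeats the days > 0 test. */
int year_with_repeated_test(int days, int *days_left) {
    int year = ORIGINYEAR;
    while (days > 0) {
        days -= is_leap_year(year) ? 366 : 365;
        if (days > 0) {
            year += 1;
        }
    }
    *days_left = days; /* zero or negative on exit */
    return year;
}
```

For day 1, both functions return 1980 and leave -365 in days_left; only on the last day of a year does the leftover come out as exactly zero.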

So I put the question: Is there a better way? Can anyone come up with the One True Algorithm for calculating years from elapsed days?

Three further notes:

1. In many counting problems there’s doubt about how to begin. Is the first day of January in 1980 to be interpreted as day 0 or as day 1? Nothing I could find in the original code resolves this ambiguity. When I first looked into the matter, I guessed that counting began with day 0 because the Zune routine correctly returns year 1980 for day values from 0 through 365. But if the zero-based assumption were correct, the Zunes would have died on January 1, not December 31. Evidently, we’re counting from 1.

2. For the record, I haven’t actually tested any of the C code above. I’ve been translating to and from Lisp. Here’s the Lisp version of the 1979 routine:

    (defun my-zune-year (day-number)
      (let ((year 1979)
            (day day-number))
        (loop while (> day 0)
              do (incf year)
                 (if (leap-year-p year)
                     (decf day 366)
                     (decf day 365)))
        year))

3. Elsewhere in the Zune firmware, there’s another routine that raises questions about programming style and practice. Here’s the implementation of the IsLeapYear function:

    static int IsLeapYear(int Year)
    {
        int Leap;

        Leap = 0;
        if ((Year % 4) == 0) {
            Leap = 1;
            if ((Year % 100) == 0) {
                Leap = (Year%400) ? 0 : 1;
            }
        }

        return (Leap);
    }

The code implements the rules of the Gregorian calendar: A year is a leap year if it is divisible by 4, unless it is divisible by 100 but not by 400. So far so good. Elsewhere in the program, however, we find these declarations:

    #define ORIGINYEAR 1980                 // the begin year
    #define MAXYEAR    (ORIGINYEAR + 100)   // the maxium year

Thus it turns out the Zune has a fixed lifespan of 101 years, from January 1, 1980, through December 31, 2080. Throughout that interval, leap years can be detected by the simple test (Year % 4) == 0, without worrying about centuries. (Many systems have been designed to work from 1901 through 2099 precisely in order to take advantage of this shortcut.) Given the MAXYEAR declaration, should the Zune have adopted a simpler leap year algorithm? Or have the authors done the right thing by putting the full logic into the code, just in case someone later decided to extend the deadline past 2100?
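The claim about the shortcut is simple to check mechanically. The following sketch compares the divisible-by-4 test against the full Gregorian rule over a range of years and reports the first year on which they disagree. The names is_leap_full, is_leap_simple and first_disagreement are hypothetical, and -1 is just an illustrative sentinel for "no disagreement".

```c
#include <stdbool.h>

/* Full Gregorian rule, as in the firmware's IsLeapYear. */
bool is_leap_full(int year) {
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* The shortcut: divisible by 4, ignoring century years. */
bool is_leap_simple(int year) {
    return year % 4 == 0;
}

/* Returns the first year in [from, to] where the two rules differ,
 * or -1 if they agree on every year in the range. */
int first_disagreement(int from, int to) {
    for (int year = from; year <= to; year++) {
        if (is_leap_full(year) != is_leap_simple(year)) {
            return year;
        }
    }
    return -1;
}
```

Over 1980 through 2080, and likewise over 1901 through 2099, the two rules agree everywhere. Stretch the upper bound to 2100 or beyond and the first disagreement is 2100 itself, which is divisible by 100 but not by 400.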


If you have trouble getting reliable hand tracking data in your application, consider the following:

  • Check the Leap Motion icon in the notification area of the Windows Taskbar or Mac Finder bar. The icon displays green if the Leap Motion software is working properly and turns red if errors have occurred. Open the Leap Motion log window and check for errors.

  • Run the diagnostic tests using the Leap Motion Control Panel Troubleshooting page

  • Check whether the Leap Motion Controller and software work outside of your application. Run the Leap Motion Visualizer or included sample application to see if other applications can receive motion tracking data

  • Make sure that all libraries required to receive Leap Motion tracking data are in the correct directories or library search paths. Refer to the project setup or “Hello world” articles for the language in which your application is written for details

  • Plug the Leap Motion USB cable directly into your computer (not a USB hub). Remove other USB devices. If another high-bandwidth device is running on the same USB bus, there may not be enough bandwidth for both devices

  • Rule out lighting issues by moving to an environment with standard indoor lighting conditions. That is, one with artificial light at a comfortable reading intensity

  • If the Leap Motion software does not appear to be producing data, stop the Leap Motion service or daemon and restart it. Unplug the Leap Motion Controller and plug it back in

    On Windows, run the following commands as Administrator:

    On Mac, run the following commands:

    On Linux, run the following commands:

  • If the Leap Motion log reports repeated disconnects, which can indicate congestion on the USB bus, switch to the Low Resource operating mode.

  • When using Microsoft Visual Studio, if you do not get tracking data when you compile your application in debug mode, but do get tracking data when you compile in release mode, then make sure you are linking with the debug build of the Leap Motion library for the debug configuration of the project.

  • Ensure that ports 6437, 6438, and 6439 on the localhost domain are not blocked by a firewall or other program.
