rdtsc's return value is _always_ mod 10 == 0 on Atom N450

On my E8200 box this doesn't occur, but on my Atom N450 netbook (both running openSUSE 11.2), whenever I read the CPU's TSC, the returned value is mod 10 == 0, i.e. it is divisible by 10 without remainder. I use the RDTSC value to measure how long interesting pieces of code take, but for the purpose of demonstration I've made up this little program:

        .global _start

_start: xorl    %ebx,%ebx
        xorl    %ecx,%ecx
        xorl    %r14d,%r14d
        movb    $10,%cl
loop:   xchgq   %rcx,%r15          # save to reg
        rdtsc                      # read TSC into EDX:EAX
        shlq    $32,%rdx
        xorq    %rax,%rdx          # full 64 bit of RDTSC
        movq    %r14,%r13          # save the old value
        movq    %rdx,%r14          # copy current
        movq    %r14,%rsi          #  argv[1] of printf()
        subq    %r13,%rdx          #  argv[2] (delta)
        leaq    format(%rip),%rdi  #  argv[0]
        xorl    %eax,%eax          #  no stack varargs
        call    printf
        xchgq   %rcx,%r15
        loop    loop

0:      xorl    %eax,%eax
        movb    $0x3c,%al          # exit syscall number (60)
        xorl    %edi,%edi          # exit status 0
        syscall

        .size   _start, .-_start

format: .asciz     "rdtsc: %#018llx = %1$llu -- delta: %llu\n"

(I usually use my own routines for converting, but to prevent readers from suggesting that the error might be there, I'm just using printf() here.)

With the above code, the output is (for example):

rdtsc: 0x000b88ef933ffd06 = 3246787292822790 -- delta: 3246787292822790
rdtsc: 0x000b88ef9342fcf4 = 3246787293019380 -- delta: 196590
rdtsc: 0x000b88ef93435dca = 3246787293044170 -- delta: 24790
rdtsc: 0x000b88ef9343b43c = 3246787293066300 -- delta: 22130
rdtsc: 0x000b88ef93440c34 = 3246787293088820 -- delta: 22520
rdtsc: 0x000b88ef9344604e = 3246787293110350 -- delta: 21530
rdtsc: 0x000b88ef9344b4d6 = 3246787293131990 -- delta: 21640
rdtsc: 0x000b88ef9345085a = 3246787293153370 -- delta: 21380
rdtsc: 0x000b88ef93455d96 = 3246787293175190 -- delta: 21820
rdtsc: 0x000b88ef9345b16a = 3246787293196650 -- delta: 21460

As can easily be seen, the deltas vary by reasonable amounts. What is conspicuous (not to say conspired ;-), however, is that the least significant decimal digit is always 0.

I've observed this phenomenon for more than two years now, and Stack Overflow is not the first place where I've made this issue public, but I haven't got a reasonable answer anywhere yet. The ideas we (other people out there and I) came up with are that

  • the TSC is incremented only every 10th cycle, but then by 10, or
  • the TSC is internally updated correctly, but reflected to the outside only every 10th cycle, or
  • the TSC is incremented by 10 each cycle.

None of these really makes sense, however. I should actually have run a program like that on the E8200 (which is currently out of order) to see whether the deltas are of the same order of magnitude or only a tenth of those in the above output. (Any volunteers?)

Googling didn't help, and neither did Intel's manuals.

Nobody I discussed this with had experienced the same behaviour. If it had to do with the kernel, then at least 3 versions were affected, but then... what does the kernel have to do with it?

I've also had the netbook in for service, and it came back with a new motherboard, which implies a new CPU, so at least two individual N450 chips must be affected.

I've also taken measures against clock frequency changes (no matter what frequency I fixed the clock to, the values varied only within the expected range, the same as shown) and switched off HT, although frequency changes and HT should, if anything, help to produce other least significant digits rather than prevent them. I did it just to be sure.

Well, if anyone wants to run the program on their machine, the command line is (provided you save the source in a file rdtsc.s):

as rdtsc.s -o rdtsc.o
ld --dynamic-linker=/lib64/ld-linux-x86-64.so.2 rdtsc.o -L /lib64 -l c -o rdtsc

In order to build it with the gcc frontend, i. e.

gcc -l c rdtsc.s -o rdtsc

you must add (or replace the _start: label with) a main: label and make it global.

[update (2012-09-15 ~21:15 UTC): Actually I could also have done this before: I just let it take the TSC before and after a sleep(1), which gives a delta slightly greater than 1,666,000,000, which shows that the third point in the list above is wrong. But still I have no idea why I don't get the full precision. /update]


Volume 3B of the Software Development Manual says this:

... for Intel Atom processors ... the time-stamp counter increments at a constant rate. That rate may be set by the maximum core-clock to bus-clock ratio of the processor or may be set by the maximum resolved frequency at which the processor is booted. The maximum resolved frequency may differ from the maximum qualified frequency of the processor, ...

That doesn't completely answer why you're seeing specifically steps of 10, but it does point out that a specific implementation is free to increment by something other than 1. I suspect you'd have to look quite a bit closer at the specific hardware specs of your machine and the BIOS implementation to discover why it's exactly 10.


Your computer's BIOS doesn't support CPU underclocking.

So your PLL runs at a constant ratio.

The ratio can't be anything else, because the clock ratio of your Atom N450 is 10.