Embedded Systems

Processor Detection Schemes

By Richard Leinecker, June 01, 1993

Knowing the processor--286, 386, or 486--means your program can include processor-specific code to improve application performance. Bob Moote adds Pentium-detection code, while Steve Heller discusses 80486 cache detection.

JUN93: PROCESSOR DETECTION SCHEMES

You don't have to write for the least-common denominator

Rick was formerly the director of technology at IntraCorp, a company specializing in entertainment software. Some of his recent titles include GrandMaster chess, BridgeMaster, and Trump Castle 3. Rick was on the staff at COMPUTE magazine and still writes a column and various features. He recently released Graphics Guru, a graphics library that most certainly uses processor-specific code. You can reach him through the DDJ offices; or on CompuServe at 74676,457.

While many applications really require the power of 80386 PCs, we can't assume (or demand) that users of our software have 386-class machines. Consequently, our programs have to support everything from 8088s to Pentiums. This means writing for the least-common denominator--and hoping Pentium users won't notice the difference.

However, it is possible to write processor-specific code, taking advantage of each processor's strengths using the techniques I present here. The hard part is finding out which processor, or even which version of a processor, you're running on; the easy part is knowing what to do. For example, my assembly code usually has vector tables for each processor, and once I've identified the CPU, I copy the appropriate vectors into a global table. From then on, it's completely transparent--you simply make calls based on the vector table.

There are few steps to finding out what CPU is under the hood. To get started, I check bits 12-15 of the flags register. If they're always set, regardless of how you alter them, you've got an 8088 or an 8086. Then, I use a self-modifying code technique to find out if the prefetch queue is 4 or 6 bytes. If it's 4, the chip is an 8088; 6 means it's an 8086.

To distinguish between a 286 and 386/486s, you need to check for 386/486 flags that are undefined for the 286. If it's a 386 or better, the next step is to check the AC bit in the flags. If it can be set, you've got a 486; if it remains unset, it's a 386.

Using Bits 12-15 of the Flags

Determining whether a processor is an 8088 or 8086 simply requires checking bits 12-15 of the flags register. Although you can't move the flags directly to a word register, you can push the flags and pop them into a register. To move the value back into the flags, reverse the operation.

To test bits 12-15, mask them off by ANDing the working register with Offfh. The value must be moved into the flags, then back into the working register. If bits 12-15 are all set, the processor is an 8088 or an 8086. The; Is It an 8088? segment of Listing One (page 126) illustrates this.

Differentiating between 8088s and 8086s is trickier. The easiest way I've found to do it is to modify code that's five bytes ahead of IP. Since the prefetch queue of an 8088 is four bytes and the prefetch queue of an 8086 is six bytes, an instruction five bytes ahead of IP won't have any effect on an 8086 the first time around. See the; Is It an 8086? segment in Listing One.

Differentiating between the 286, 386, and 486

The easiest way to determine whether or not the CPU is a 80286 is to set the NT (nested task flag) and IOPL (I/O privilege level) bits in the flags register, then check to see if they're still set. These bits are undefined for 286s and will hold 0s no matter what you do. Since you can't set the flag bits directly, you have to use ax and the stack to indirectly set the bits. The ; Is It an 80286? part of Listing One describes this.

More Details.

Actually, it turns out there's one case where IsItA286 returns a bogus value, indicating a 286 when in fact you're running on a 386. This is due to an obscure bug in Windows 3.1, as Bob Moote of Phar Lap Software has graciously pointed out. In a DOS box under Windows 3.1 enhanced mode, the pushf and popf instructions are virtualized if the DOS box is running at IOPL 0. According to Bob, when you execute the virtualized pushf, Windows doesn't remember that you tried to set those bits, and it returns the real hardware bits, which are all 0s. Consequently, you think you're on a 286 when it's really a 386 (or later). Keep in mind that this only happens under Windows 3.1 once out of every several thousand runs because Windows doesn't use IOPL 0 that often. Phar Lap gets around this problem in its 386|DOS-Extender (once they know they're on at least a 286) by doing an additional test before trying the flag-setting test. They use the SMSW instruction to check if the PE (protected mode) bit is set; if so, it's in V86 mode and must be a 386 or later. If the PE bit indicates real mode, then they do the flag-setting test to see if it's a 286.

If you know you've got a 386 or higher, you have to look for a 486. The easiest way I've found to do this is to set the 486 AC (alignment check) bit in the flags and see if it remains set. If it does, it's a 486. The code segment in Listing One labeled; Is It an 80386 or 80486? shows how to do this.

For a Pentium-detection scheme, see the accompanying text box entitled "Pentium Detection." For a cache-detection technique, see the text box "486 Cache Detection."

Conclusion

Now you have no more excuses. Write the best code for the processor you've got, and give users their money's worth. Supporting 386s and higher is a solution, but not one nearly as professional as multiprocessor support.

More Details.

486 Cache Detection

Steve Heller

Steve is the author of Large Problems, Small Machines (Academic Press, 1992) and president of Chrysalis Software, P.O. Box 0335, Baldwin, NY 11510. He can be contacted on CompuServe at 71101, 1702.

While Intel's 486 has an internal cache of 8 Kbytes, 486 clones from other manufacturers have internal caches of varying sizes. Cyrix's 486SLC, for instance, has a 1K cache, while IBM's 486SLC has one that's 16 Kbytes in size. Furthermore, most 486-based PCs also have additional external caches. The program presented here (see Listing Three, Page, 127) detects whether internal or external (or both) caches are active so that you can turn them on or off, if necessary.

This approach takes advantage of the fact that jumping to an instruction that isn't aligned on a doubleword boundary is time consuming when there's no caching, as the processor has to access additional memory words to extract the next instruction to be executed. The program has two loops: one that executes an indirect jump to a properly aligned target, and a second that executes the same jump to a misaligned target. If the 486/66 DX2 internal cache is active, the two loops take almost identical amounts of time. If only the external cache is active, the misaligned loops takes about 50 percent longer to execute than the aligned one. If neither is active, the misaligned loop takes about twice as long to execute as the aligned one. Similar results occur on a 486/33 and on a Cyrix 486SLC/50. Even a 386/40 DX with an external cache displays a noticeable effect, although of much smaller magnitude. Keep in mind that if the internal cache is active, this program cannot determine the status of the external cache, as it is completely masked by the internal caching.

The program was compiled using Borland C++ 3.1 and should be straightforward to port to other compilers as long as they allow inline assembly language. However, any changes must not disturb the alignment of the first loop and the misalignment of the second.

Pentium Detection

Robert Moote

Robert is a cofounder of Phar Lap Software and can be contacted at 60 Aberdeen Ave., Cambridge, MA 02138.

To determine whether or not the target system is Pentium based, first ascertain that you are at least on an i486 chip, as Rick describes. Then call the detect routine in Listing Two (page 127), which returns 4 if the processor is an i486, 5 or greater if the processor is a Pentium or later chip.

In the Pentium chip, Intel finally designed in a software-detection method by adding the CPUID instruction for obtaining information about the chip family, stepping, and unique features, and by adding an ID bit to the EFLAGS register to ascertain whether the processor supports the CPUID instruction. The code first attempts to toggle the ID bit in the EFLAGS register; the Pentium processor is the first in the 80x86 family to allow this bit to be set to both 1 and 0.

The CPUID instruction takes an input value in the EAX register. A 0 in EAX returns the highest input value recognized by CPUID in EAX (this is 1 for the Pentium chip), and a vendor identification string in EBX-EDX (GenuineIntel for Intel's chip). Passing a 1 in EAX returns the chip information in EAX-EDX; this is the form of CPUID used in the detect code. On the Pentium, the chip family, model, and stepping are returned in EAX, and some bit flags identifying processor features are returned in EDX. The detect routine returns the chip family code, taken from bits 8-11 of EAX. The value 5 is used to identify the Pentium chip family; if Intel takes the logical step of incrementing this ID value for each subsequent chip in the 80x86 processor line, this is the last 80x86 detect routine you'll have to write.

Of course, the code presented here requires an assembler that supports Pentium opcodes. Note that the .586 directive is how you turn on Pentium support in the Phar Lap assembler; other assemblers may use different directives, since the chip isn't called a 586. You can also code the detect routine with any assembler just by using the DB directive to hardcode the opcode for CPUID.



_PROCESSOR DETECTION SCHEMES_
by Richard C. Leinecker

[LISTING ONE]

<a name="0168_000c">

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;  Detect the Processor Type -- by Richard C. Leinecker                 ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

_PTEXT      SEGMENT PARA PUBLIC 'CODE'
        ASSUME  CS:_PTEXT, DS:_PTEXT

        public  _Processor
; This routine returns the processor type as an integer value.
; C Prototype
; int Processor( void );
; Returns: 0 == 8088, 1 == 8086, 2 == 80286, 3 == 80386, 4 == 80486
; Code assumes es, ax, bx, cx, and dx can be altered. If their contents
; must be preserved, then you'll have to push and pop them.

_Processor  proc    far

        push    bp          ; Preserve bp
        mov bp,sp           ; bp = sp

        push    ds          ; Preserve ds
        push    di          ; and di

        mov ax,cs           ; Point ds
        mov ds,ax           ; to _PTEXT

        call    IsItAn8088      ; Returns 0 or 2

        cmp al,2            ; 2 = 286 or better
        jge AtLeast286      ; Go to 286 and above code

        call    IsItAn8086      ; Returns 0 or 1

        jmp short ExitProcessor ; Jump to exit code
AtLeast286:
        call    IsItA286        ; See if it's a 286

        cmp al,3            ; 4 and above for 386 and up
        jl  short ExitProcessor ; Jump to exit code
AtLeast386:
        call    IsItA386        ; See if it's a 386
ExitProcessor:
        pop di          ; Restore di,
        pop ds          ; ds,
        pop bp          ; and bp

        ret             ; Return to caller
_Processor  endp

; Is It an 8088?
; Returns ax==0 for 8088/8086, ax==2 for 80286 and above
IsItAn8088  proc
        pushf               ; Preserve the flags

        pushf               ; Push the flags so
        pop bx          ; we can pop them into bx

        and bx,00fffh       ; Mask off bits 12-15
        push    bx          ; and put it back on the stack

        popf                ; Pop value back into the flags
        pushf               ; Push it back. 8088/8086 will
                                                ; have bits 12-15 set
        pop bx          ; Get value into bx
        and bx,0f000h       ; Mask all but bits 12-15

        sub ax,ax           ; Set ax to 8088 value
        cmp bx,0f000h       ; See if the bits are still set
        je  Not286          ; Jump if not

        mov al,2            ; Set for 286 and above
Not286:
        popf                ; Restore the flags

        ret             ; Return to caller
IsItAn8088  endp

; Is It an 8086?
; Returns ax==0 for 8088, ax==1 for 8086
; Code takes advantage of the 8088's 4-byte prefetch queues and 8086's
; 6-byte prefetch queues. By self-modifying the code at a location exactly 5
; bytes away from IP, and determining if the modification took effect,
; you can differentiate between 8088s and 8086s.
IsItAn8086  proc

        mov ax,cs          ; es == code segment
        mov es,ax

        std            ; Cause stosb to count backwards

        mov dx,1           ; dx is flag and we'll start at 1
        mov di,OFFSET EndLabel ; di==offset of code tail to modify
        mov al,090h        ; al==nop instruction opcode
        mov cx,3           ; Set for 3 repetitions
        REP stosb          ; Store the bytes

        cld            ; Clear the direction flag
        nop            ; Three nops in a row
        nop            ; provide dummy instructions
        nop
        dec dx         ; Decrement flag (only with 8088)
        nop            ; dummy instruction--6 bytes
EndLabel:
        nop

        mov ax,dx          ; Store the flag in ax

        ret            ; Back to caller
IsItAn8086  endp

; Is It an 80286?
; Determines whether processor is a 286 or higher. Going into subroutine ax = 2
; If the processor is a 386 or higher, ax will be 3 before returning. The
; method is to set ax to 7000h which represent the 386/486 NT and IOPL bits
; This value is pushed onto the stack and popped into the flags (with popf).
; The flags are then pushed back onto the stack (with pushf). Only a 386 or 486
; will keep the 7000h bits set. If it's a 286, those bits aren't defined and
; when the flags are pushed onto stack these bits will be 0. Now, when ax is
; popped these bits can be checked. If they're set, we have a 386 or 486.
IsItA286    proc
        pushf               ; Preserve the flags
        mov ax,7000h        ; Set the NT and IOPL flag
                        ; bits only available for
                        ; 386 processors and above
        push    ax          ; push ax so we can pop 7000h
                        ; into the flag register
        popf                ; pop 7000h off of the stack
        pushf               ; push the flags back on
        pop ax          ; get the pushed flags
                        ; into ax
        and ah,70h          ; see if the NT and IOPL
                        ; flags are still set
        mov ax,2            ; set ax to the 286 value
        jz  YesItIsA286     ; If NT and IOPL not set
                        ; it's a 286
        inc ax          ; ax now is 4 to indicate
                        ; 386 or higher
YesItIsA286:
        popf                ; Restore the flags

        ret             ; Return to caller
IsItA286    endp

; Is It an 80386 or 80486?
; Determines whether processor is a 386 or higher. Going into subroutine ax=3
; If the processor is a 486, ax will be 4 before leaving. The method is to set
; the AC bit of the flags via EAX and the stack. If it stays set, it's a 486.
IsItA386    proc
        mov di,ax           ; Store the processor value
        mov bx,sp           ; Save sp
        and sp,0fffch       ; Prevent alignment fault
 .386
        pushfd              ; Preserve the flags

        pushfd              ; Push so we can get flags
        pop eax         ; Get flags into eax
        or  eax,40000h      ; Set the AC bit
        push    eax         ; Push back on the stack
        popfd               ; Get the value into flags
        pushfd              ; Put the value back on stack
        pop eax         ; Get value into eax
        test    eax,40000h      ; See if the bit is set
        jz  YesItIsA386     ; If not we have a 386

        add di,2            ; Increment to indicate 486
YesItIsA386:
        popfd               ; Restore the flags
 .8086
        mov sp,bx           ; Restore sp
        mov ax,di           ; Put processor value into ax

        ret             ; Back to caller
IsItA386    endp

_PTEXT      ENDS
        END




<a name="0168_000d">
<a name="0168_000e">

[LISTING TWO]

<a name="0168_000e">

.586
; Pentium detect routine. Call only after verifying processor is an i486 or
; later. Returns 4 if on i486, 5 if Pentium, 6 or greater for future
; Intel processors.
EF_ID   equ   200000h      ; ID bit in EFLAGS register
Pentium proc near

; Check for Pentium or later by attempting to toggle the ID bit in EFLAGS reg;
; if we can't, it's an i486.
   pushfd         ; get current flags
   pop   eax         ;
   mov   ebx,eax         ;
   xor   eax,EF_ID   ; attempt to toggle ID bit
   push   eax         ;
   popfd            ;
   pushfd         ; get new EFLAGS
   pop   eax         ;
   push   ebx      ; restore original flags
   popfd            ;
   and   eax,EF_ID   ; if we couldn't toggle ID, it's an i486
   and   ebx,EF_ID      ;
   cmp   eax,ebx         ;
   je   short Is486      ;

; It's a Pentium or later. Use CPUID to get processor family.
   mov   eax,1      ; get processor info form of CPUID
   cpuid            ;
   shr   ax,8      ; get Family field;  5 means Pentium.
   and   ax,0Fh         ;
   ret
Is486:
   mov   ax,4      ; return 4 for i486
   ret
pentium endp




<a name="0168_000f">
<a name="0168_0010">

[LISTING THREE]

<a name="0168_0010">

#pragma inline
main()
{
    long    start1;
    long end1;
    long    start2;
    long end2;
    start1 = clock();
asm     P386
asm     mov eax,10000000
asm     lea ebx,loop1
asm     loop1:
asm     dec eax
asm     jz  loop1e
asm     jmp ebx
loop1e:
    start2 = end1 = clock();
asm     mov eax,10000000
asm     lea ebx,loop2
asm     nop
asm     loop2:
asm     dec eax
asm     jz  loop2e
asm     jmp ebx
loop2e:
    end2 = clock();

   printf("misaligned loop time = %d, aligned loop time =%d\n",
    (int)(end1-start1), (int)(end2-start2));
    return;
}

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

Embedded Systems

Processor Detection Schemes

Using Bits 12-15 of the Flags

Differentiating between the 286, 386, and 486

Conclusion

486 Cache Detection

Pentium Detection

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Embedded Systems Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

Embedded Systems

Processor Detection Schemes

Related Reading

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

Embedded Systems Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content