ARC Firmware Emulation in NTLDR

Basis: All boot loader functions use the standard ARC firmware functions (prefixed Arc). These Arc functions are declared as macros that invoke the function pointers stored in SYSTEM_BLOCK->FirmwareVector, which ARC-compliant systems provide. On the IBM PC and its derivatives, no ARC firmware is available, so the ARC firmware calls made by the boot loader functions must be emulated and translated into standard PC BIOS calls.

Standard ARC Firmware Routine Declarations (I/O)

//
// Define I/O functions.
//

#define ArcClose(FileId) \
((PARC_CLOSE_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[CloseRoutine])) \
((FileId))

#define ArcGetReadStatus(FileId) \
((PARC_READ_STATUS_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[ReadStatusRoutine])) \
((FileId))

#define ArcMount(MountPath, Operation) \
((PARC_MOUNT_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[MountRoutine])) \
((MountPath), (Operation))

#define ArcOpen(OpenPath, OpenMode, FileId) \
((PARC_OPEN_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[OpenRoutine])) \
((OpenPath), (OpenMode), (FileId))

#define ArcRead(FileId, Buffer, Length, Count) \
((PARC_READ_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[ReadRoutine])) \
((FileId), (Buffer), (Length), (Count))

#define ArcSeek(FileId, Offset, SeekMode) \
((PARC_SEEK_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[SeekRoutine])) \
((FileId), (Offset), (SeekMode))

#define ArcWrite(FileId, Buffer, Length, Count) \
((PARC_WRITE_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[WriteRoutine])) \
((FileId), (Buffer), (Length), (Count))

#define ArcGetFileInformation(FileId, FileInformation) \
((PARC_GET_FILE_INFO_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[GetFileInformationRoutine])) \
((FileId), (FileInformation))

#define ArcSetFileInformation(FileId, AttributeFlags, AttributeMask) \
((PARC_SET_FILE_INFO_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[SetFileInformationRoutine])) \
((FileId), (AttributeFlags), (AttributeMask))

#define ArcGetDirectoryEntry(FileId, Buffer, Length, Count) \
((PARC_GET_DIRECTORY_ENTRY_ROUTINE)(SYSTEM_BLOCK->FirmwareVector[GetDirectoryEntryRoutine])) \
((FileId), (Buffer), (Length), (Count))

On x86, no firmware-supplied SYSTEM_BLOCK exists, so the macro is redirected to the GlobalSystemBlock struct:

#define SYSTEM_BLOCK (&GlobalSystemBlock)

GlobalSystemBlock is declared in arcemul.c as follows:

SYSTEM_PARAMETER_BLOCK GlobalSystemBlock =
    {
        0,                              // Signature??
        sizeof(SYSTEM_PARAMETER_BLOCK), // Length
        0,                              // Version
        0,                              // Revision
        NULL,                           // RestartBlock
        NULL,                           // DebugBlock
        NULL,                           // GenerateExceptionVector
        NULL,                           // TlbMissExceptionVector
        MaximumRoutine,                 // FirmwareVectorLength
        GlobalFirmwareVectors,          // Pointer to vector block
        0,                              // VendorVectorLength
        NULL                            // Pointer to vendor vector block
    };

The FirmwareVector field of the GlobalSystemBlock points to GlobalFirmwareVectors, which is declared as follows (also in arcemul.c):

PVOID GlobalFirmwareVectors[MaximumRoutine];

The entries of the GlobalFirmwareVectors array are filled in by the BlFillInSystemParameters function:

VOID
BlFillInSystemParameters(
    IN PBOOT_CONTEXT BootContextRecord
    )
/*++

Routine Description:

    This routine fills in all the fields in the Global System Parameter Block
    that it can.  This includes all the firmware vectors, the vendor-specific
    information, and anything else that may come up.

Arguments:

    None.

Return Value:

    None.

--*/

{
    int cnt;

    //
    // Fill in the pointers to the firmware functions which we emulate.
    // Those which we don't emulate are stubbed by BlArcNotYetImplemented,
    // which will print an error message if it is accidentally called.
    //

    for (cnt=0; cnt<MaximumRoutine; cnt++) {
        GlobalFirmwareVectors[cnt]=(PVOID)BlArcNotYetImplemented;
    }
    GlobalFirmwareVectors[CloseRoutine]  = (PVOID)AEClose;
    GlobalFirmwareVectors[OpenRoutine]  = (PVOID)AEOpen;
    GlobalFirmwareVectors[MemoryRoutine]= (PVOID)AEGetMemoryDescriptor;
    GlobalFirmwareVectors[SeekRoutine]  = (PVOID)AESeek;
    GlobalFirmwareVectors[ReadRoutine]  = (PVOID)AERead;
    GlobalFirmwareVectors[ReadStatusRoutine]  = (PVOID)AEReadStatus;
    GlobalFirmwareVectors[WriteRoutine] = (PVOID)AEWrite;
    GlobalFirmwareVectors[GetFileInformationRoutine] = (PVOID)AEGetFileInformation;
    GlobalFirmwareVectors[GetTimeRoutine] = (PVOID)AEGetTime;
    GlobalFirmwareVectors[GetRelativeTimeRoutine] = (PVOID)AEGetRelativeTime;

    GlobalFirmwareVectors[GetPeerRoutine] = (PVOID)FwGetPeer;
    GlobalFirmwareVectors[GetChildRoutine] = (PVOID)FwGetChild;
    GlobalFirmwareVectors[GetParentRoutine] = (PVOID)AEGetParent;
    GlobalFirmwareVectors[GetComponentRoutine] = (PVOID)FwGetComponent;
    GlobalFirmwareVectors[GetDataRoutine] = (PVOID)AEGetConfigurationData;
    GlobalFirmwareVectors[GetEnvironmentRoutine] = (PVOID)AEGetEnvironment;

    GlobalFirmwareVectors[RestartRoutine] = (PVOID)AEReboot;
    GlobalFirmwareVectors[RebootRoutine] = (PVOID)AEReboot;

}
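
The dispatch mechanism described so far can be sketched in a few lines of standalone C. This is only an illustration, not the NTLDR code: every Demo* name, the cut-down enum and the return values are hypothetical, and only the pattern (stub every slot, override the emulated ones, call through a casting macro) follows the source.

```c
#include <stddef.h>

/* Hypothetical, cut-down set of routine indices; the real enum is larger. */
enum { DemoReadRoutine, DemoMountRoutine, DemoMaximumRoutine };

/* Generic slot type. The original stores PVOID; a function-pointer type
   is used here so that the casts stay within standard C. */
typedef void (*DEMO_SLOT)(void);
typedef int (*PDEMO_IO_ROUTINE)(unsigned long FileId);

typedef struct {
    size_t     FirmwareVectorLength;
    DEMO_SLOT *FirmwareVector;
} DEMO_PARAMETER_BLOCK;

DEMO_SLOT DemoFirmwareVectors[DemoMaximumRoutine];
DEMO_PARAMETER_BLOCK DemoGlobalSystemBlock = { DemoMaximumRoutine, DemoFirmwareVectors };

#define DEMO_SYSTEM_BLOCK (&DemoGlobalSystemBlock)

/* Same shape as the Arc* macros: cast the slot, then call it. */
#define DemoArcRead(FileId) \
    ((PDEMO_IO_ROUTINE)(DEMO_SYSTEM_BLOCK->FirmwareVector[DemoReadRoutine]))((FileId))
#define DemoArcMount(FileId) \
    ((PDEMO_IO_ROUTINE)(DEMO_SYSTEM_BLOCK->FirmwareVector[DemoMountRoutine]))((FileId))

/* Stand-in for BlArcNotYetImplemented: just reports failure. */
int DemoNotYetImplemented(unsigned long FileId) { (void)FileId; return -1; }

/* Stand-in for an emulated routine such as AERead. */
int DemoRead(unsigned long FileId) { return (int)FileId + 100; }

void DemoFillInSystemParameters(void)
{
    int i;
    /* Stub every slot first, then override the emulated ones. */
    for (i = 0; i < DemoMaximumRoutine; i++) {
        DemoFirmwareVectors[i] = (DEMO_SLOT)DemoNotYetImplemented;
    }
    DemoFirmwareVectors[DemoReadRoutine] = (DEMO_SLOT)DemoRead;
}
```

After DemoFillInSystemParameters() has run, DemoArcRead reaches DemoRead, while DemoArcMount falls through to the stub, which mirrors how an unemulated Arc call ends up in BlArcNotYetImplemented.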

All functions assigned to GlobalFirmwareVectors are implemented in arcemul.c. Let's take a look at the AERead function (the ReadRoutine slot, reached through ArcRead) as an example:

ARC_STATUS
AERead (
    IN ULONG FileId,
    OUT PVOID Buffer,
    IN ULONG Length,
    OUT PULONG Count
    )

/*++

Routine Description:

    Reads from the specified file or device

Arguments:

    FileId - specifies the file to read from

    Buffer - Address of buffer to hold the data that is read

    Length - Maximum number of bytes to read

    Count -  Address of location in which to store the actual bytes read.

Return Value:

    ESUCCESS - Read completed successfully

    !ESUCCESS - Read failed.

--*/

{
    ARC_STATUS Status;
    ULONG Limit;
    ULONG PartCount;
    //
    // Special case for console input
    //

    if (FileId == 0) {
        return(BiosConsoleRead(FileId,Buffer,Length,Count));
    } else {

        *Count = 0;

        do {

            if (((ULONG) Buffer & 0xffff0000) !=
               (((ULONG) Buffer + Length) & 0xffff0000)) {

                Limit = 0x10000 - ((ULONG) Buffer & 0x0000ffff);
            } else {

                Limit = Length;

            }

            Status = (BlFileTable[FileId].DeviceEntryTable->Read)( FileId,
                                                                Buffer,
                                                                Limit,
                                                                &PartCount  );
            *Count += PartCount;
            Length -= Limit;
            (PCHAR) Buffer += Limit;

            if (Status != ESUCCESS) {
                BlPrint("Disk I/O error: Status = %lx\n",Status);
                return(Status);
            }

        } while (Length > 0);

        return(Status);
    }
}

In short, AERead first checks whether the supplied file handle is the console (yes, console I/O is also performed through the Arc I/O operations). If the handle is not the console handle, AERead assumes that it belongs to a file or device and uses BlFileTable to locate the file descriptor.
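
One detail of the AERead loop is easy to miss: each pass is limited so that a single device read never crosses a 64 KB-aligned address boundary. The source does not say why; it is likely related to real-mode BIOS transfer limits. The sketch below isolates just that arithmetic. The Demo* names are hypothetical, and unlike the original do/while loop, the counting helper does nothing for a zero-length request.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the Limit computation in AERead: if [buffer, buffer + length)
   reaches into the next 64 KB region, stop at the boundary. */
uint32_t DemoChunkLimit(uint32_t buffer, uint32_t length)
{
    assert(buffer + length >= buffer);  /* sketch ignores 32-bit wraparound */
    if ((buffer & 0xffff0000u) != ((buffer + length) & 0xffff0000u)) {
        return 0x10000u - (buffer & 0x0000ffffu);
    }
    return length;
}

/* Counts how many device reads a request would be split into. */
unsigned DemoCountChunks(uint32_t buffer, uint32_t length)
{
    unsigned chunks = 0;
    while (length > 0) {
        uint32_t limit = DemoChunkLimit(buffer, length);
        buffer += limit;   /* advance exactly as AERead advances Buffer */
        length -= limit;
        chunks++;
    }
    return chunks;
}
```

For example, a 0x200-byte read into a buffer at 0x1FF00 is limited to 0x100 bytes on the first pass, and the remainder is read on the next one.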

To provide a little background, NTLDR contains multiple internal drivers (e.g. BIOS and SCSI). They all implement a common set of functions (although this is not enforced by an interface or abstract-class-like constraint), and they all use the global variable BlFileTable and a common file descriptor struct to maintain the list of open devices and files.

Back to AERead: the function locates the file descriptor in BlFileTable and calls the Read function pointer in its DeviceEntryTable to perform the read operation.

Let's say that the handle belongs to a device or partition governed by the BIOS driver (biosdrv.c). At the time of AEOpen (ArcOpen), the open function would have called the BiosPartitionOpen function.

The BiosPartitionOpen function checks whether the supplied path refers to a device (disk) and, if it does, calls BiosDiskOpen, which in turn sets the file descriptor's DeviceEntryTable to BiosDiskEntryTable. If the supplied path refers to a partition, DeviceEntryTable is set to BiosPartitionEntryTable instead.

The following is the declaration of BiosDiskEntryTable and BiosPartitionEntryTable:

BL_DEVICE_ENTRY_TABLE BiosPartitionEntryTable =
    {
        (PARC_CLOSE_ROUTINE)BiosPartitionClose,
        (PARC_MOUNT_ROUTINE)BlArcNotYetImplemented,
        (PARC_OPEN_ROUTINE)BiosPartitionOpen,
        (PARC_READ_ROUTINE)BiosPartitionRead,
        (PARC_READ_STATUS_ROUTINE)BlArcNotYetImplemented,
        (PARC_SEEK_ROUTINE)BiosPartitionSeek,
        (PARC_WRITE_ROUTINE)BiosPartitionWrite,
        (PARC_GET_FILE_INFO_ROUTINE)BiosGetFileInfo,
        (PARC_SET_FILE_INFO_ROUTINE)BlArcNotYetImplemented,
        (PRENAME_ROUTINE)BlArcNotYetImplemented,
        (PARC_GET_DIRECTORY_ENTRY_ROUTINE)BlArcNotYetImplemented,
        (PBOOTFS_INFO)BlArcNotYetImplemented
    };

BL_DEVICE_ENTRY_TABLE BiosDiskEntryTable =
    {
        (PARC_CLOSE_ROUTINE)BiosDiskClose,
        (PARC_MOUNT_ROUTINE)BlArcNotYetImplemented,
        (PARC_OPEN_ROUTINE)BiosDiskOpen,
        (PARC_READ_ROUTINE)BiosDiskRead,
        (PARC_READ_STATUS_ROUTINE)BlArcNotYetImplemented,
        (PARC_SEEK_ROUTINE)BiosPartitionSeek,
        (PARC_WRITE_ROUTINE)BiosDiskWrite,
        (PARC_GET_FILE_INFO_ROUTINE)BiosGetFileInfo,
        (PARC_SET_FILE_INFO_ROUTINE)BlArcNotYetImplemented,
        (PRENAME_ROUTINE)BlArcNotYetImplemented,
        (PARC_GET_DIRECTORY_ENTRY_ROUTINE)BlArcNotYetImplemented,
        (PBOOTFS_INFO)BlArcNotYetImplemented
    };

So in our case, the function pointer leads to either the BiosDiskRead or the BiosPartitionRead function.

The BiosPartitionRead function simply obtains the parent disk's file handle and calls its Read function, which leads to BiosDiskRead. The BiosDiskRead function, leaving aside implementation details that are not relevant here, calls the MdGetPhysicalSectors function to read individual sectors from the BIOS disk. The following is an example of a MdGetPhysicalSectors call made by BiosDiskRead:

        Status = MdGetPhysicalSectors((USHORT)BlFileTable[FileId].u.DriveContext.Drive,
                                      (USHORT)head,
                                      (USHORT)cylinder,
                                      (USHORT)sector,
                                      1,
                                      LocalBuffer
                                     );
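
The excerpt above does not show how head, cylinder and sector are derived. The sketch below is the conventional logical-sector-to-CHS conversion for a BIOS disk, not the code from biosdrv.c. The Demo* names and the geometry parameters are assumptions, and BIOS sector numbers are taken to be 1-based.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t cylinder;
    uint16_t head;
    uint16_t sector;   /* 1-based, as the BIOS expects */
} DEMO_CHS;

/* Conventional LBA -> CHS conversion for a given drive geometry. */
DEMO_CHS DemoLbaToChs(uint32_t lba, uint16_t heads, uint16_t sectorsPerTrack)
{
    DEMO_CHS chs;
    assert(heads > 0 && sectorsPerTrack > 0);
    chs.sector   = (uint16_t)(lba % sectorsPerTrack + 1);
    chs.head     = (uint16_t)((lba / sectorsPerTrack) % heads);
    chs.cylinder = (uint16_t)(lba / ((uint32_t)sectorsPerTrack * heads));
    return chs;
}
```

With a hypothetical geometry of 16 heads and 63 sectors per track, logical sector 0 maps to cylinder 0, head 0, sector 1, and logical sector 1008 is the first sector of cylinder 1.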

The MdGetPhysicalSectors is implemented in machine.c as follows:

ARC_STATUS
MdGetPhysicalSectors(
    IN USHORT Drive,
    IN USHORT HeadNumber,
    IN USHORT TrackNumber,
    IN USHORT SectorNumber,
    IN USHORT NumberOfSectors,
    PUCHAR PointerToBuffer
    )
{
    ARC_STATUS Status;
    int Retry;
    int MaxRetry;

//    DBG1( CHECKPOINT("MdGetPhysSec"); )

    ASSERT((ULONG)PointerToBuffer < 0x100000);

    // Note, even though args are short, they are pushed on the stack with
    // 32bit alignment so the effect on the stack seen by the 16bit real
    // mode code is the same as if we were pushing longs here.
    //

    if (NumberOfSectors == 0) {
        return(ESUCCESS);
    }

    // prevent cylinder # from wrapping

    if(TrackNumber > 1023) {
        return(E2BIG);
    }

//    MaxRetry = Drive < 128 ? FLOPPY_RETRY : HARDDISK_RETRY;
    MaxRetry = 10;

    Retry=0;
    do {

#if 0
    BlPrint("Requesting: d=%x, h=%x  t=%x  sn=%x  num=%x  buf=%lx\n",
           Drive,HeadNumber,TrackNumber,SectorNumber,NumberOfSectors,
           PointerToBuffer);
#endif

        Status = GET_SECTOR(
                    READ_SECTOR,
                    Drive,
                    HeadNumber,
                    TrackNumber,
                    SectorNumber,
                    NumberOfSectors,
                    PointerToBuffer
                    );

        if (Status) {
//            BlPrint("Error %lx from BIOS, resetting\n",Status);
            MdResetDiskSystem(Drive);
        }

    } while ( (Status) && (Retry++ < MaxRetry) );
    return Status;
}

Note that MdGetPhysicalSectors internally invokes GET_SECTOR to perform the actual read operation, resetting the disk system and retrying if the call fails.

GET_SECTOR is, in fact, yet another macro defined in bldrx86.h:

#define GET_SECTOR          (*ExternalServicesTable->DiskIOSystem)

ExternalServicesTable is declared in entry.c of boot\lib and initialised by the DoGlobalInitialization function, which is called by the NtProcessStartup routine:

VOID
DoGlobalInitialization(
    IN PBOOT_CONTEXT BootContextRecord
    )
{
    ARC_STATUS Status;

    Status = InitializeMemorySubsystem(BootContextRecord);
    if (Status != ESUCCESS) {
        BlPrint("InitializeMemory failed %lx\n",Status);
        while (1) {
        }
    }
    ExternalServicesTable=BootContextRecord->ExternalServicesTable;
    MachineType = BootContextRecord->MachineType;
...

 

VOID
NtProcessStartup(
    IN PBOOT_CONTEXT BootContextRecord
    )
/*++

Routine Description:

    Main entry point for setup loader. Control is transferred here by the
    start-up (SU) module.

Arguments:

    BootContextRecord - Supplies the boot context, particularly the
        ExternalServicesTable.

Returns:

    Does not return. Control eventually passed to the kernel.

--*/
{
    ARC_STATUS Status;

    //
    // Initialize the boot loader's video
    //

    DoGlobalInitialization(BootContextRecord);

    BlFillInSystemParameters(BootContextRecord);

    if (BootContextRecord->FSContextPointer->BootDrive == 0) {

        //
        // Boot was from A:
        //

        strcpy(BootPartitionName,"multi(0)disk(0)fdisk(0)");

        //
        // To get around an apparent bug on the BIOS of some MCA machines
        // (specifically the NCR 386sx/MC20 w/ BIOS version 1.04.00 (3421),
        // Phoenix BIOS 1.02.07), whereby the first int13 to floppy results
        // in a garbage buffer, reset drive 0 here.
        //

        GET_SECTOR(0,0,0,0,0,0,NULL);

#if defined(ELTORITO)
    } else if (BlIsElToritoCDBoot(BootContextRecord->FSContextPointer->BootDrive)) {

        //
        // Boot was from El Torito CD
        //

        sprintf(BootPartitionName, "multi(0)disk(0)cdrom(%u)", BootContextRecord->FSContextPointer->BootDrive);
        ElToritoCDBoot = TRUE;
#endif

    } else {

        //
        // Find the partition we have been booted from.  Note that this
        // is *NOT* necessarily the active partition.  If the system has
        // Boot Mangler installed, it will be the active partition, and
        // we have to go figure out what partition we are actually on.
        //
        BlGetActivePartition(BootPartitionName);

    }

    //
    // Initialize the memory descriptor list, the OS loader heap, and the
    // OS loader parameter block.
    //

    Status = BlMemoryInitialize();
    if (Status != ESUCCESS) {
        BlPrint("Couldn't initialize memory\n");
        while (1) {
        }
    }

    //
    // Initialize the OS loader I/O system.
    //

    Status = BlIoInitialize();
    if (Status != ESUCCESS) {
        BlPrint("Couldn't initialize I/O\n");
    }

    //
    // Call off to regular startup code
    //
    BlStartup(BootPartitionName);

    //
    // we should never get here!
    //
    do {
        GET_KEY();
    } while ( 1 );

}

The NtProcessStartup routine is the entry point of OSLOADER.EXE, which is the upper part of the NTLDR image. As background, the NTLDR image is constructed by concatenating STARTUP.COM and OSLOADER.EXE: STARTUP.COM is the 16-bit portion that runs first during NTLDR initialisation, and OSLOADER.EXE is the 32-bit portion that displays the menu, boots operating systems, and so on.

Note that BootContextRecord is passed in by STARTUP.COM (or by the ARC firmware on ARC-compliant systems) and supplies the value of ExternalServicesTable.

The following are the last lines of the SuMain function in main.c of STARTUP.COM, which initiate the transition to OSLOADER.EXE:

    //
    // Transfer control to the OS loader
    //

    TransferToLoader(LoaderEntryPoint);

TransferToLoader function is implemented in su.asm:

_TransferToLoader proc near

;  generates a double fault for debug purposes
;        mov      sp,0
;        push 0

        mov      ebx,dword ptr [esp+2]      ; get entrypoint arg
        xor      eax,eax
        mov      ax,[saveDS]

;
; Setup OS loader's stack. Compute FLAT model esp to id map to
; original stack.
;
        mov      cx,KeDataSelector
        mov      ss,cx
        mov      esp,LOADER_STACK  ;** TMP HACK *** BUGBUG BUGBUG
;
; Load ds and es with kernel's data selectors
;

        mov      ds,cx
        mov      es,cx

;
; Setup pointer to file system and boot context records
;
; Make a linear pointer to the Boot Context Record

        shl      eax,4
        xor      ecx,ecx
        mov      cx,offset _BootRecord
        add      eax,ecx
        push     eax

        push     1010h       ; dummy return address.
        push     1010h       ; dummy return address.

;
; Push 48bit address of loader entry-point
;
        db OVERRIDE
        push    KeCodeSelector
        push    ebx

;
; Pass control to the OS loader
;
        db OVERRIDE
        retf

_TransferToLoader endp

TransferToLoader initialises the protected mode stack and pushes the 32-bit linear address of _BootRecord onto the stack so that the entry point function of OSLOADER.EXE can retrieve it as its argument. It then transfers control to the NtProcessStartup function (that is, the entry point of OSLOADER.EXE) by means of a far return.
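
The "make a linear pointer" step in the listing (shl eax,4 followed by adding the offset of _BootRecord) is the usual real-mode segment:offset to linear address conversion. A minimal sketch follows. The function name is hypothetical and the values in the test are examples only, not the actual segment used by STARTUP.COM.

```c
#include <stdint.h>

/* linear = segment * 16 + offset, as computed in _TransferToLoader
   from the saved DS value and the offset of _BootRecord. */
uint32_t DemoRealModeToLinear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Because the loader runs with flat selectors, this linear address can be used directly as the PBOOT_CONTEXT pointer received by NtProcessStartup.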

BootRecord is declared in sudata.asm as follows:

Public _BootRecord
_BootRecord      dw       offset _TEXT:_FsContext
                 dw       SU_LOAD_ADDRESS SHR 16

                 dw       offset _TEXT:_ExportEntryTable
                 dw       SU_LOAD_ADDRESS SHR 16
...

Note that the second entry of _BootRecord, which the loader sees as the ExternalServicesTable field of BOOT_CONTEXT, is set to _ExportEntryTable. This table is also declared in sudata.asm and contains the addresses of the routines implemented in exp.asm:

;
; This is called the External Services Table by the OS loader
;

align 4
public _ExportEntryTable
_ExportEntryTable equ     $
                 dw       offset _TEXT:RebootProcessor
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetSector
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetKey
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetCounter
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:Reboot
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:AbiosServices
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:DetectHardware
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:HardwareCursor
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetDateTime
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:ComPort
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:IsMcaMachine
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetStallCount
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:InitializeDisplayForNt
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetMemoryDescriptor
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetEddsSector
                 dw       SU_LOAD_ADDRESS SHR 16
                 dw       offset _TEXT:GetElToritoStatus
                 dw       SU_LOAD_ADDRESS SHR 16
                 dd       0
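
Each entry in this table is a pair of 16-bit words: the routine's offset followed by the upper half of SU_LOAD_ADDRESS. On little-endian x86, the 32-bit loader reads such a pair as a single flat pointer. The sketch below shows that composition. The function name is hypothetical, and the load address used in the test is an example value, not one taken from the source.

```c
#include <stdint.h>

/* Combine the two dw values of one table entry into the 32-bit
   pointer that the 32-bit loader sees: the low word is the offset,
   the high word is SU_LOAD_ADDRESS SHR 16. */
uint32_t DemoEntryToFlatPointer(uint16_t offsetWord, uint16_t highWord)
{
    return ((uint32_t)highWord << 16) | offsetWord;
}
```

Note that the result equals load address plus offset only when the low 16 bits of the load address are zero, which this layout implicitly assumes.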

The following is the definition of EXTERNAL_SERVICES_TABLE (ExternalServicesTable of BOOT_CONTEXT) for reference:

typedef struct _EXTERNAL_SERVICES_TABLE {
    VOID (__cdecl *  RebootProcessor)(VOID);
    NTSTATUS (__cdecl * DiskIOSystem)(USHORT,USHORT,USHORT,USHORT,USHORT,USHORT,PUCHAR);
    ULONG (__cdecl * GetKey)(VOID);
    ULONG (__cdecl * GetCounter)(VOID);
    VOID (__cdecl * Reboot)(ULONG);
    ULONG (__cdecl * AbiosServices)(USHORT,PUCHAR,PUCHAR,PUCHAR,PUCHAR,USHORT,USHORT);
    VOID (__cdecl * DetectHardware)(ULONG, ULONG, PVOID, PULONG, PCHAR, ULONG);
    VOID (__cdecl * HardwareCursor)(ULONG,ULONG);
    VOID (__cdecl * GetDateTime)(PULONG,PULONG);
    VOID (__cdecl * ComPort)(LONG,ULONG,UCHAR);
    BOOLEAN (__cdecl * IsMcaMachine)(VOID);
    ULONG (__cdecl * GetStallCount)(VOID);
    VOID (__cdecl * InitializeDisplayForNt)(VOID);
    VOID (__cdecl * GetMemoryDescriptor)(P820FRAME);
#if defined(ELTORITO)
    NTSTATUS (__cdecl * GetEddsSector)(ULONG,ULONG,ULONG,ULONG,PUCHAR);
    NTSTATUS (__cdecl * GetElToritoStatus)(PUCHAR,ULONG);
#endif
} EXTERNAL_SERVICES_TABLE, *PEXTERNAL_SERVICES_TABLE;

Note that DiskIOSystem (= GET_SECTOR) corresponds to the GetSector entry in the external services table. Now let's look at the GetSector function implemented in exp.asm:

EXPORT_ENTRY_MACRO    GetSector
;
; Move the arguments from the caller's 32bit stack to the SU module's
; 16bit stack.
;

        MAKE_STACK_FRAME_MACRO  <GetSectorFrame>, ebx

;
; Go into real mode. We still have the same stack and sp
; but we'll be executing in realmode.
;

        ENTER_REALMODE_MACRO

;
; Get the requested sectors. Arguments on realmode stack
; Make (bp) point to the bottom of the argument frame.
;
        push     bp
        mov      bp,sp
        add      bp,2

;
; Put the buffer pointer into es:bx. Note that and buffer
; addresses passed to this routine MUST be in the lower one
; megabyte of memory to be addressable in real mode.
;

        mov      eax,[bp].BufferPointer
        mov      bx,ax
        and      bx,0fh
        shr      eax,4
        mov      es,ax
;
; Place the upper 2 bits of the 10bit track/cylinder number
; into the uppper 2 bits of the SectorNumber as reguired by
; the bios.
;
        mov      cx,word ptr [bp].TrackNumber
        xchg     ch,cl
        shl      cl,6
        add      cl,byte ptr [bp].SectorNumber

;
; Get the rest of the arguments
;
        mov      ah,byte ptr [bp].FunctionNumber
        mov      al,byte ptr [bp].NumberOfSectors
        mov      dh,byte ptr [bp].HeadNumber
        mov      dl,byte ptr [bp].DriveNumber
...
        int     BIOS_DISK_INTERRUPT

Note that the calling environment (OSLOADER.EXE) runs in 32-bit protected mode, while the routines implemented in exp.asm and the BIOS routines are intended for the 16-bit real mode environment.

For this reason, GetSector first switches back to real mode temporarily (this is as simple as turning off the protected mode flag in CR0 and far jumping to the real mode address) and then performs its work, including the BIOS interrupt call. BIOS_DISK_INTERRUPT is, as you may have already expected, 13h (INT 13h, the BIOS disk services interrupt).
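
The register set-up in GetSector is compact enough to restate in C. The sketch below reproduces the two conversions from the assembly: splitting a linear buffer address below 1 MB into ES:BX, and packing the 10-bit cylinder number and the sector number into CX. The struct and function names are hypothetical; the bit layout follows the assembly shown above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t es;
    uint16_t bx;
    uint16_t cx;
} DEMO_INT13_REGS;

DEMO_INT13_REGS DemoPackInt13(uint32_t linearBuffer, uint16_t track, uint8_t sector)
{
    DEMO_INT13_REGS r;
    uint8_t ch, cl;

    /* Real mode can only address the first megabyte. */
    assert(linearBuffer < 0x100000u);

    /* mov bx,ax / and bx,0fh / shr eax,4 / mov es,ax */
    r.bx = (uint16_t)(linearBuffer & 0x0fu);
    r.es = (uint16_t)(linearBuffer >> 4);

    /* mov cx,track / xchg ch,cl / shl cl,6 / add cl,sector */
    ch = (uint8_t)(track & 0xffu);               /* low 8 bits of cylinder  */
    cl = (uint8_t)(((track >> 8) & 0x03u) << 6); /* high 2 bits in bits 7-6 */
    cl = (uint8_t)(cl + sector);                 /* sector in bits 5-0      */
    r.cx = (uint16_t)(((uint16_t)ch << 8) | cl);
    return r;
}
```

This also shows why MdGetPhysicalSectors rejects track numbers above 1023: only ten bits of the cylinder number fit into CX.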

In summary, the ARC firmware emulation, for disk operations in particular, takes the following path:

  • Standard ARC routine call (e.g. ArcRead) ==>
    • Arc* = SYSTEM_BLOCK->FirmwareVector[*Routine] (e.g. SYSTEM_BLOCK->FirmwareVector[ReadRoutine]) ->
    • SYSTEM_BLOCK = &GlobalSystemBlock (e.g. GlobalSystemBlock.FirmwareVector[ReadRoutine]) ->
    • GlobalSystemBlock.FirmwareVector = GlobalFirmwareVectors (e.g. GlobalFirmwareVectors[ReadRoutine]) ->
    • Resolve GlobalFirmwareVectors[operationType] and call (e.g. AERead)
  • AERead calls BlFileTable[FileId].DeviceEntryTable->Read
    • Assuming ArcOpen resolved BIOS driver for this handle, BiosDiskRead or BiosPartitionRead will be called.
  • BiosDiskRead calls MdGetPhysicalSectors
  • MdGetPhysicalSectors calls GET_SECTOR
    • GET_SECTOR = ExternalServicesTable->DiskIOSystem
    • ExternalServicesTable = BootContextRecord->ExternalServicesTable
    • BootContextRecord = _BootRecord (STARTUP.COM)
    • _BootRecord.ExternalServicesTable = _ExportEntryTable
    • _ExportEntryTable.DiskIOSystem = GetSector (STARTUP.COM)
    • Therefore, GET_SECTOR = GetSector (STARTUP.COM)
  • GetSector switches to real mode and invokes INT 13h

NT 4: Razzle Build Environment Initialisation

The Razzle build environment can be entered through either of two scripts under \nt\public\tools: razzle.cmd and ntenv.cmd. ntenv.cmd is responsible for initialising the common build environment variables and invoking the user-specific initialisation script, while razzle.cmd sets a few extra environment variables before invoking ntenv.cmd.

The following is the implementation of razzle.cmd:

:START
goto SET_BINARIES_DIR

	if "%PROCESSOR_ARCHITECTURE%" == "ALPHA" set USERNAME=alphafre
	if "%PROCESSOR_ARCHITECTURE%" == "MIPS"  set USERNAME=mipsfre
	if "%PROCESSOR_ARCHITECTURE%" == "PPC"   set USERNAME=ppcfre
	if "%PROCESSOR_ARCHITECTURE%" == "x86"   set USERNAME=x86fre

	if "%PROCESSOR_ARCHITECTURE%" == "ALPHA" set LOGNAME=HALPHAFIX
	if "%PROCESSOR_ARCHITECTURE%" == "MIPS"  set LOGNAME=HMIPSFIX
	if "%PROCESSOR_ARCHITECTURE%" == "PPC"   set LOGNAME=HPPCFIX
	if "%PROCESSOR_ARCHITECTURE%" == "x86"   set LOGNAME=HX86FIX

	set PATH=%PATH%;W:\mstools.nt40;W:\idw.nt40

	set _NTDRIVE=
	if "%1" == "main"           set _NTDRIVE=W:
	if "%1" == "hotfix_free"    set _NTDRIVE=U:
	if "%1" == "hotfix_checked" set _NTDRIVE=V:
	if not "%_NTDRIVE%" == "" goto SET_BINARIES_DIR
	echo !!! missing parameter 'main', 'hotfix_free' or 'hotfix_checked'
	pause
	goto END

:SET_BINARIES_DIR

	rem shift

	if "%PROCESSOR_ARCHITECTURE%" == "ALPHA" set _ntALPHAboot=%_ntdrive%\binaries
	if "%PROCESSOR_ARCHITECTURE%" == "MIPS"  set _ntMIPSboot=%_ntdrive%\binaries
	if "%PROCESSOR_ARCHITECTURE%" == "PPC"   set _ntPPCboot=%_ntdrive%\binaries
	if "%PROCESSOR_ARCHITECTURE%" == "x86"   set _nt386boot=%_ntdrive%\binaries

	cmd /K %_NTDRIVE%\NT\PUBLIC\TOOLS\ntenv.cmd %1 %2 %3 %4 %5 %6 %7 %8 %9

:END

Although the full details are unknown, it seems that razzle.cmd was going through major changes when this source tree was distributed. Note that the goto SET_BINARIES_DIR at the top simply bypasses the portion of the script that sets the USERNAME and LOGNAME environment variables, extends PATH and selects _NTDRIVE.

Based on the original script implementation (without the bypass), this razzle.cmd would set the USERNAME and LOGNAME variables according to the build machine's architecture for a free build, and set the source repository drive (_NTDRIVE) according to the supplied argument (main, hotfix_free or hotfix_checked). I would speculate that this razzle.cmd was originally used on a retail release build machine and was later modified for general development use. On a side note, the structure of razzle.cmd seems to have been changed completely in the Windows 2000 Razzle build environment, and the version we have may be one from the middle of that transition. This theory is also supported by various leaked internal e-mail messages.

Besides what the original razzle.cmd would have done, the script also sets the _nt(ARCH)boot environment variable to %_NTDRIVE%\binaries. The purpose of this variable is unclear. One might think it is the path of the final build output, but that is, as will be explained later, _NT(ARCH)TREE. It is likely that this variable is no longer used in the build environment. Once this variable is set, razzle.cmd invokes ntenv.cmd, the main script responsible for initialising the Razzle build environment.

The following is the implementation of ntenv.cmd:

@rem
@rem If no drive has been specified for the NT development tree, assume
@rem C:.  To override this, place a SET _NTDRIVE=X: in your CONFIG.SYS
@rem
if "%_NTDRIVE%" == "" set _NTDRIVE=W:
@rem
@rem If no directory has been specified for the NT development tree, assume
@rem \nt.  To override this, place a SET _NTROOT=\xt in your CONFIG.SYS
@rem
if "%_NTROOT%" == "" set _NTROOT=\NT
set _NTBINDIR=%_NTDRIVE%%_NTROOT%
@rem
@rem This command file assumes that the developer has already defined
@rem the USERNAME environment variable to match their email name (e.g.
@rem stevewo).
@rem
@rem We want to remember some environment variables so we can restore later
@rem if necessary (see NTUSER.CMD)
@rem
set _NTUSER=%USERNAME%
@rem
@rem Assume that the developer has already included \NT\PUBLIC\TOOLS
@rem in their path.  If not, then it is doubtful they got this far.
@rem
path %PATH%;%_NTBINDIR%\PUBLIC\OAK\BIN
@rem
@rem No hidden semantics of where to get libraries and include files.  All
@rem information is included in the command lines that invoke the compilers
@rem and linkers.
@rem
set LIB=
set INCLUDE=
@rem
@rem Setup default build parameters.
@rem
set BUILD_DEFAULT=ntoskrnl ntkrnlmp daytona -e -i -nmake -i
set BUILD_DEFAULT_TARGETS=-386
set BUILD_MAKE_PROGRAM=nmake.exe
@rem
@rem Setup default nmake parameters.
@rem
if "%NTMAKEENV%" == "" set NTMAKEENV=%_NTBINDIR%\PUBLIC\OAK\BIN
@rem
@rem Setup the user specific environment information
@rem
call %_NTBINDIR%\PUBLIC\TOOLS\ntuser.cmd
@rem
@rem Optional parameters to this script are command line to execute
@rem
%1 %2 %3 %4 %5 %6 %7 %8 %9

First, ntenv.cmd checks whether _NTDRIVE, the source tree root drive, is set. If not, it sets it to W: (the comment in the script still says C:).

Once the root drive is set, the script checks _NTROOT, the directory on the source tree root drive that contains the NT source tree (the public and private directories). If _NTROOT is not set, the script sets it to \NT. After this, the _NTBINDIR variable is set to %_NTDRIVE%%_NTROOT%, which is W:\NT by default.
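
For readers who find batch syntax hard to follow, the defaulting logic of these first few lines can be restated as a small C function. This is purely an illustration of the script's behaviour. The function name and signature are hypothetical, and a NULL or empty string stands in for an unset environment variable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds the _NTBINDIR value the way ntenv.cmd does:
   _NTDRIVE defaults to W:, _NTROOT defaults to \NT. */
void DemoResolveNtBinDir(const char *ntDrive, const char *ntRoot,
                         char *out, size_t outSize)
{
    assert(out != NULL && outSize > 0);
    if (ntDrive == NULL || strlen(ntDrive) == 0) {
        ntDrive = "W:";
    }
    if (ntRoot == NULL || strlen(ntRoot) == 0) {
        ntRoot = "\\NT";
    }
    /* set _NTBINDIR=%_NTDRIVE%%_NTROOT% */
    snprintf(out, outSize, "%s%s", ntDrive, ntRoot);
}
```

Passing NULL for both inputs yields W:\NT, while explicit values are simply concatenated, exactly as the set command does.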

The next portion sets the _NTUSER variable to %USERNAME%. This variable is later used to load the appropriate build user profile. Note that the USERNAME environment variable is set, by default, to the account name of the logged-on user. Also note that USERNAME would have been set manually to (ARCH)fre if the original razzle.cmd had been run.

Next, the PATH variable is updated to include %_NTBINDIR%\public\oak\bin. By default, this would be W:\nt\public\oak\bin.

As explained in the comments, the LIB and INCLUDE variables, which compilers and linkers would otherwise consult, are cleared because the build deliberately avoids relying on them. This ensures that there is no confusion about which header and library files are used to build each component. The full or relative paths to the headers and libraries are provided on the command lines that invoke the compilers and linkers.

Next, the BUILD_DEFAULT, BUILD_DEFAULT_TARGETS and BUILD_MAKE_PROGRAM variables are set. Note that these variables are the macros for the Build Utility (build.exe). The BUILD_DEFAULT macro specifies the default options for the Build Utility. In this case, it is set to ntoskrnl ntkrnlmp daytona -e -i -nmake -i. This effectively tells the Build Utility to build any optional targets named ntoskrnl, ntkrnlmp or daytona, to generate a build log (-e), to ignore extraneous compiler warning messages (-i), and to pass the -i option (ignore exit codes from commands) to nmake. BUILD_MAKE_PROGRAM, as the name suggests, is the name of the make program, which is nmake.exe in this case.

Next, NTMAKEENV is set to %_NTBINDIR%\public\oak\bin unless it is already defined. The NTMAKEENV variable points to the OAK bin directory, which contains makefile.def. By design, every buildable project directory contains a default makefile with the following content:

#
# DO NOT EDIT THIS FILE!!!  Edit .\sources. if you want to add a new source
# file to this component.  This file merely indirects to the real make file
# that is shared by all the components of NT OS/2
#
!INCLUDE $(NTMAKEENV)\makefile.def

Note that the NTMAKEENV environment variable is referenced here to resolve the path of the directory that contains makefile.def. Since the Build Utility invokes NMAKE to perform the actual build once its own work is done, NTMAKEENV must be set to the OAK bin directory.

Next, ntuser.cmd is invoked to perform user-specific initialisation. After this, the work of ntenv.cmd is done, and it runs whatever command was supplied as its arguments, if any.

The following is the implementation of ntuser.cmd:

@rem
@rem If no drive has been specified for the NT development tree, assume
@rem W:.  To override this, place a SET _NTDRIVE=X: in your CONFIG.SYS
@rem
if "%_NTDRIVE%" == "" set _NTDRIVE=W:
if NOT "%USERNAME%" == "" goto skip1
echo !!! Error - USERNAME environment varialbe not set in CONFIG.SYS
goto done
:skip1
@rem
@rem This command file is either invoked by NTENV.CMD during the startup of
@rem a Razzle screen group. Or it is invoked directly by a developer to
@rem switch developer environment variables on the fly.  If invoked with
@rem no argument, then restores the original developer's environment (as
@rem remembered by the NTENV.CMD command file).  Otherwise the argument is
@rem a developer's email name and that developer's environment is established.
@rem
if NOT "%1" == "" set USERNAME=%1
if "%_NTUSER%" == "" goto skip2
if "%1" == "" if "%USERNAME%" == "%_NTUSER%" alias src /d
if "%1" == "" set USERNAME=%_NTUSER%
:skip2
@rem
@rem Most tools look for .INI files in the INIT environment variable, so set
@rem it.  MS WORD looks in MSWNET of all places.
@rem
set INIT=%_NTBINDIR%\private\developr\%USERNAME%
set MSWNET=%INIT%
@rem
@rem Load CUE with the standard public aliases and the developer's private ones
@rem
if "%_NTUSER%" == "" goto skip3
@rem
@rem Initialize user settable NT nmake environment variables
@rem
set NTPROJECTS=public
set NT386FLAGS=
set NTMIPSFLAGS=
set NTCPPFLAGS=
set NTDEBUG=cvp
set BUILD_OPTIONS=
set 386_OPTIMIZATION=
set 386_WARNING_LEVEL=
alias src > nul
if NOT errorlevel 1 goto skip4
alias -p remote.exe -f %_NTBINDIR%\private\developr\cue.pub -f %_NTBINDIR%\private\developr\ntcue.pub -f %INIT%\cue.pri
alias -f %_NTBINDIR%\private\developr\cue.pub -f %_NTBINDIR%\private\developr\ntcue.pub -f %INIT%\cue.pri
goto skip4
:skip3
alias src > nul
if errorlevel 1 goto skip4
alias -f %_NTBINDIR%\private\developr\cue.pub -f %INIT%\cue.pri
:skip4
@rem
@rem Load the developer's private environment initialization (keyboard rate,
@rem screen size, colors, environment variables, etc).
@rem
call %INIT%\setenv.cmd
echo Current user is now %USERNAME%
:done

As with ntenv.cmd, the _NTDRIVE variable is checked and, if not set, set to W:. Next, the USERNAME variable is checked and, if it is empty, an error message is displayed and the script ends. USERNAME should be either the account name of the logged-on user (the script comments call it the developer's email name) or one of the special predefined profile names (e.g. x86fre) used for release builds.

If USERNAME is set, the script checks if there is an argument provided. If there is, the USERNAME variable is updated to the value of the argument. This is to allow the active user profile to be switched among different users as described in the comment.

After this, the _NTUSER variable is checked and, if it is not set, the next two lines are skipped. Note that _NTUSER is set to USERNAME in ntenv.cmd and represents the initial/default username which ntuser.cmd should return to if no specific username is provided in the arguments.

If _NTUSER is set, no argument is provided (that is, the script is returning to the initial/default user profile) and USERNAME already equals _NTUSER, the src alias is deleted. The src alias, by default, translates to cd /d %_NTDRIVE%\nt\private. As a side note, the alias command defines command aliases for the shell. For example, typing src on the command line internally executes cd /d %_NTDRIVE%\nt\private and changes the current directory to the private directory. After this, USERNAME is set to _NTUSER, the default username, if no argument is provided.

The next part sets the INIT and MSWNET variables. As described in the comments, many tools look for their .INI settings files in the directory named by INIT, which is set to %_NTBINDIR%\private\developr\%USERNAME%. The developr directory contains the user/developer profile directories.

Next, if _NTUSER is not set (that is, if no default user is set), the script jumps to :skip3. There it queries the src alias and, unless that query returns an error level of 1 or higher, loads the public aliases from cue.pub and the private aliases from %INIT%\cue.pri before reaching :skip4. Otherwise, it continues, sets various macros referenced by makefile.def and, if the src alias query returns an error level of 1 or higher, loads the public and private aliases for both remote.exe and cmd.exe. remote.exe is a remote shell access program that allows commands to be executed from a remote host, allowing the build process to be controlled remotely. %_NTBINDIR%\private\developr\cue.pub and ntcue.pub contain public (global) aliases, and %INIT%\cue.pri contains private aliases defined by the developer.
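The username handling is easier to follow as a small function. The following Python sketch models only the USERNAME selection in ntuser.cmd; resolve_username and its return shape are my own invention for illustration, and the comparison is a plain string comparison like the one in the script.

```python
def resolve_username(username, ntuser, arg):
    """Model of the USERNAME selection in ntuser.cmd.

    username: current %USERNAME%, ntuser: %_NTUSER% ('' if unset),
    arg: first script argument ('' if none).
    Returns (new_username, delete_src_alias).
    """
    if not username:
        # The script prints an error and jumps to :done.
        raise ValueError("USERNAME environment variable not set")
    delete_src_alias = False
    if arg:
        username = arg                # if NOT "%1" == "" set USERNAME=%1
    if ntuser:                        # if unset, the script jumps to :skip2
        if not arg and username == ntuser:
            delete_src_alias = True   # alias src /d
        if not arg:
            username = ntuser         # return to the default profile
    return username, delete_src_alias

print(resolve_username("bob", "alice", ""))        # back to the default user
print(resolve_username("alice", "alice", "x86fre"))  # switch to another profile
```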

At :skip4, the %INIT%\setenv.cmd, which contains user-defined environment variable settings, is invoked and a message identifying the active user profile is displayed.

For setenv.cmd, I will provide x86fre and x86chk as examples. These user profiles are located under \nt\private\developr\[x86fre/x86chk].

We will first take a look at the setenv.cmd.

set BINPLACE_FLAGS=-x -a
set BINPLACE_LOG=%_NTBINDIR%\binplace.log
set REBASE_FLAGS=-p
set NTCPPFLAGS=-D_IDWBUILD -DRDRDBG -DSRVDBG

It begins with tool configurations. BINPLACE_FLAGS is set to -x -a, where -x is used to strip private symbols out of the binplaced executables and -a is used to save all symbols to a separate file. BINPLACE_LOG path is set to %_NTBINDIR%\binplace.log, which would be W:\nt\binplace.log by default. REBASE_FLAGS is set to -p, which removes all private debug symbols from the re-based file.

if "%_blddrive%"=="" set _blddrive=w:
set PATH=%PATH%;%_NTBINDIR%\private\mvdm\tools16;%_NTBINDIR%\public\tools;%_blddrive%\bldtools\qfe\nt40

Now, a new variable called _BLDDRIVE is set to W: if it is not already set. _BLDDRIVE is only used in the following line, which updates the PATH environment variable. PATH is extended with the MVDM tools16 directory (which contains 16-bit tools, including masm386, cl16, link16, etc.) and the public\tools directory, which contains a number of build and repository maintenance scripts. It also adds %_blddrive%\bldtools\qfe\nt40 (W:\bldtools\qfe\nt40 by default); I am not sure what that directory is supposed to contain.

set DIRCMD=/o:gn
set ntvdm_based_build=yes
set _STATOPTIONS=fc

DIRCMD is updated to /o:gn to order entries by name and with directories first (by default, dir will order files and directories by name simultaneously). ntvdm_based_build is set to yes. This variable is resolved by \nt\private\mvdm\softpc.new\bios component to determine the tools to use. With ntvdm_based_build defined, the build makefile will use link16 and reloc available in PATH instead. The purpose of _STATOPTIONS is not very clear at the moment. For now, it is deemed to be unimportant.

set NT_ROOT=%_NTBINDIR%
set NTOS_ROOT=%_NTBINDIR%\private\ntos
set NW_ROOT=%_NTBINDIR%\private\nw

After that, we have more path settings.

set MARS_PCH=1
set CAIRO=
set USE_BUILD=1

Once again, the purpose of MARS_PCH is unclear. Based on its name, it probably enables precompiled headers for a component called MARS; I just don't know what MARS is. Next, CAIRO is undefined since this is not a Cairo build. The purpose of USE_BUILD is also unclear. Based on an analysis of the source tree, it seems these variables were used earlier in development but are no longer used.

set BH_ROOT=%_NTBINDIR%\private\net\bh
set BH_BUILD=%_NTBINDIR%\private\net\bh\build

We have more path settings.

set BINROOT=\binaries
set BUILD_MULTIPROCESSOR=1
set NTDBGFILES=1

BINROOT variable is set to \binaries. This variable is referenced by *fre/*chk cue.pri alias definitions, as well as bindsys.cmd. cue.pri defines the bin alias to be %BINDRIVE%: && cd %BINROOT%. BUILD_MULTIPROCESSOR is set to 1, which is referenced by the Build Utility to enable multithreaded build. NTDBGFILES is set to 1 and this enables the makefile.def to set BINPLACE_DBGFLAGS_NT to -S $(_NTTREE)\Symbols, thereby setting the binplace symbol output path.

set BINDRIVE=%_ntdrive%
set BINARIES=%BINDRIVE%\binaries
set CAIROBINS=%BINARIES%\cairo

BINDRIVE is set to _NTDRIVE and BINARIES is set to _NTDRIVE\binaries. The BINARIES variable is referenced by CAIROBINS and other scripts to set the final build output directory. CAIROBINS is used in a similar manner to BINARIES to set the final Cairo build output directory. Since our build target is NT, the Cairo variables are not important to us.

set NTUSERK=1
set TMP=%_NTDRIVE%\tmp
set WOWTOO=1

The only place in the source tree NTUSERK is referenced is in \nt\private\mvdm\wow16\user:

# Check to see if we're being invoked from ntuser\client

!IFDEF NTUSERK
!UNDEF TUKWILA
!ENDIF

TUKWILA is not referenced anywhere under mvdm and it seems to be another remnant of the old build process. In short, it is not important.

TMP is set to %_NTDRIVE%\tmp, and this is, obviously, the temporary directory path. The TMP variable is referenced by various component build makefiles and scripts for temporary directory path.

WOWTOO is set to 1 and this forces \nt\private\mvdm\wow16\user to be built while building \nt\private\ntos\w32\ntuser.

if not exist %_NTDRIVE%\tmp mkdir %_NTDRIVE%\tmp
echo off
call %_NTBINDIR%\private\developr\%USERNAME%\SetBldOp
echo on
call %_NTBINDIR%\private\developr\%USERNAME%\%USERNAME%
cls

Next, the tmp directory is created if it doesn't already exist. Note that directory references in the build scripts are often not tied together. For example, we have mkdir %_NTDRIVE%\tmp when it should really be mkdir %TMP%. This kind of atrocity is quite common in the version of Razzle we have, and such references should be examined carefully.

After this, we just call SetBldOp.cmd, and then %USERNAME%.cmd. With x86fre, it will be calling %_NTBINDIR%\private\developr\x86fre\x86fre.cmd.

Let's analyse what's inside SetBldOp.cmd:

@rem
@rem where all good build options go to be accounted for
@rem

echo off

set build_options=accesory
set build_options=%build_options% accupd
set build_options=%build_options% adaptec
set build_options=%build_options% afd
set build_options=%build_options% all_kbds
set build_options=%build_options% amd
set build_options=%build_options% apps
set build_options=%build_options% arcinst
set build_options=%build_options% arctest
set build_options=%build_options% bintrack
set build_options=%build_options% bowser
set build_options=%build_options% bugboard
set build_options=%build_options% cap
set build_options=%build_options% cdfs
set build_options=%build_options% chk
set build_options=%build_options% chkalive
set build_options=%build_options% clntnb
set build_options=%build_options% clntspx
set build_options=%build_options% clnttcp
set build_options=%build_options% cluster
set build_options=%build_options% compdir
set build_options=%build_options% control
set build_options=%build_options% creatdll
set build_options=%build_options% creative
set build_options=%build_options% crt
set build_options=%build_options% cuntfs
set build_options=%build_options% data
set build_options=%build_options% daytona
set build_options=%build_options% dce
set build_options=%build_options% decmon
set build_options=%build_options% dfs
set build_options=%build_options% dgipxc
set build_options=%build_options% dgipxs
set build_options=%build_options% dgudpc
set build_options=%build_options% dgudps
set build_options=%build_options% dhcpins
set build_options=%build_options% diskedit
set build_options=%build_options% dlc
set build_options=%build_options% dlgedit
set build_options=%build_options% dosdev
set build_options=%build_options% dphhogs
set build_options=%build_options% dskimage
set build_options=%build_options% editreg
set build_options=%build_options% ep
set build_options=%build_options% exchange
set build_options=%build_options% execmail
set build_options=%build_options% fastimer
set build_options=%build_options% fax
set build_options=%build_options% fontedit
set build_options=%build_options% games
set build_options=%build_options% gutils
set build_options=%build_options% halncr
set build_options=%build_options% he
set build_options=%build_options% hpmon
set build_options=%build_options% hu
set build_options=%build_options% imagedit
set build_options=%build_options% inet
set build_options=%build_options% internet
set build_options=%build_options% jet
set build_options=%build_options% linkinfo
set build_options=%build_options% lmmon
set build_options=%build_options% logger
set build_options=%build_options% locator
set build_options=%build_options% masm
set build_options=%build_options% mini
set build_options=%build_options% mp
set build_options=%build_options% mstest
set build_options=%build_options% mup
set build_options=%build_options% nbt
set build_options=%build_options% ndis
set build_options=%build_options% ndrdbg
set build_options=%build_options% net
set build_options=%build_options% netbios
set build_options=%build_options% netcmd
set build_options=%build_options% netflex
set build_options=%build_options% newinvtp
set build_options=%build_options% npfddi
set build_options=%build_options% ntbackup
set build_options=%build_options% ntbakems
set build_options=%build_options% nw
set build_options=%build_options% nwc
set build_options=%build_options% objdir
set build_options=%build_options% 
set build_options=%build_options% 
set build_options=%build_options% ole
set build_options=%build_options% ole2map
set build_options=%build_options% ole2ui32
set build_options=%build_options% ole32
set build_options=%build_options% oleprop
set build_options=%build_options% oletools
set build_options=%build_options% oleutest
set build_options=%build_options% opengl
set build_options=%build_options% optlayts
set build_options=%build_options% otnboot
set build_options=%build_options% printers
set build_options=%build_options% proxstub
set build_options=%build_options% pviewer
set build_options=%build_options% random
set build_options=%build_options% ras
set build_options=%build_options% rcdump
set build_options=%build_options% rdr
set build_options=%build_options% rdr2
set build_options=%build_options% readline
set build_options=%build_options% reality
set build_options=%build_options% roshare
set build_options=%build_options% routing
set build_options=%build_options% rpcsign
set build_options=%build_options% ru
set build_options=%build_options% scsiwdl
set build_options=%build_options% seclist
set build_options=%build_options% setlink
set build_options=%build_options% sfm
set build_options=%build_options% simbad
set build_options=%build_options% slcd
set build_options=%build_options% sleep
set build_options=%build_options% slmnew
set build_options=%build_options% smbtrace
set build_options=%build_options% smbtrsup
set build_options=%build_options% sndblst
set build_options=%build_options% snmp
set build_options=%build_options% sockets
set build_options=%build_options% sol
set build_options=%build_options% solidpp
set build_options=%build_options% spy
set build_options=%build_options% srv
set build_options=%build_options% streams
set build_options=%build_options% svrnb
set build_options=%build_options% svrspx
set build_options=%build_options% svrtcp
set build_options=%build_options% symbios
set build_options=%build_options% tail
set build_options=%build_options% takeown
set build_options=%build_options% tapi
set build_options=%build_options% tcpip
set build_options=%build_options% tdi
set build_options=%build_options% testprot
set build_options=%build_options% tile
set build_options=%build_options% tlibs
set build_options=%build_options% ui
set build_options=%build_options% unimodem
set build_options=%build_options% ups
set build_options=%build_options% usl
set build_options=%build_options% uspifs
set build_options=%build_options% usr
set build_options=%build_options% vctools
set build_options=%build_options% vdmredir
set build_options=%build_options% vi
set build_options=%build_options% view
set build_options=%build_options% wangview
set build_options=%build_options% wap
set build_options=%build_options% windiff
set build_options=%build_options% winhelp
set build_options=%build_options% winvtp
set build_options=%build_options% wst
set build_options=%build_options% wx86shl
set build_options=%build_options% xerox
set build_options=%build_options% zoomin
goto set%processor_architecture%

goto end
:setx86
set build_options=%build_options% amd
set build_options=%build_options% cpqfws2e
set build_options=%build_options% detect
set build_options=%build_options% flashpnt
set build_options=%build_options% halncr
set build_options=%build_options% masm
set build_options=%build_options% thunk32
set build_options=%build_options% 

goto end
:setmips
set build_options=%build_options% amd

goto end
:setalpha
set build_options=%build_options% a2coff

goto end
:setppc
set build_options=%build_options% cs423x wd90c24a

:end

Basically, what we have is the full list of OPTIONAL_DIRS entries that are to be built during this build. Note that BUILD_OPTIONS is the Build Utility macro that tells the utility to build all of the optional components listed in it.
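The shape of SetBldOp.cmd (a common list followed by one architecture-specific block selected by goto set%processor_architecture%) can be sketched in Python as follows. The COMMON list is deliberately abbreviated to three entries, and build_options is a hypothetical helper, not something that exists in the tree.

```python
# Abbreviated stand-in for the long common list in SetBldOp.cmd.
COMMON = ["accesory", "accupd", "adaptec"]

# One entry per :set<arch> label in the script.
ARCH_SPECIFIC = {
    "x86": ["amd", "cpqfws2e", "detect", "flashpnt", "halncr", "masm", "thunk32"],
    "mips": ["amd"],
    "alpha": ["a2coff"],
    "ppc": ["cs423x", "wd90c24a"],
}

def build_options(arch, common=COMMON):
    """Return the space-separated BUILD_OPTIONS value for one architecture."""
    # cmd matches goto labels case-insensitively, hence the lower().
    return " ".join(common + ARCH_SPECIFIC[arch.lower()])

print(build_options("x86"))
```

Each architecture block simply appends to whatever the common part accumulated, so the final value is one long space-separated list.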

Now, before I get to x86fre.cmd, I need to make this clear. The \nt\private\developr directory has the following user profiles:

alphachk
alphafre
mipschk
mipsfre
ppcchk
ppcfre
x86chk
x86fre

Each of these directories contains setenv.cmd, setbldop.cmd, and the chk and fre .cmd files for every architecture. For example, \nt\private\developr\x86fre contains alphachk.cmd, and \alphachk contains x86fre.cmd. The contents of x86fre.cmd in \x86fre and \alphachk are in fact the same, and this is not limited to x86fre.cmd.

In short, the contents of all of the directories listed above are the same. The file that ultimately makes the arch/build-specific environment settings is the (arch)(chk/fre).cmd file.

Now that this should be clear, we will look at x86fre.cmd:

set _NT386TREE=%BINARIES%\nt
set _CAIRO386TREE=%CAIROBINS%\nt

REM 
REM bwill 8/2/96 - not sure what these lines are for,
REM                so I'm commenting them out of the
REM                QFE build.
REM
REM set FreeBuild=\\X86Fre\Binaries
REM set FreeCBuild=\\X86Fre\CairoBin
REM

set NTDEBUG=
set NTBBT=
set MACHINENAME=x86fre
set CheckInNtverp=

REM
REM bwill 9/17/96 - added files necessary for 
REM                 rebasing.
REM
set REBASELANG=usa
set _QFE_BUILD=1

REM
REM bwill 9/18/96 - added files for lego
REM
set _BLDTOOLS=%_NTDRIVE%\bldtools\qfe\nt40

_NT386TREE is set to %BINARIES%\nt, where BINARIES was set earlier in setenv.cmd. By default, it would translate to W:\binaries\nt. This variable dictates the final build output directory. Next, _CAIRO386TREE is set, but since we are not building Cairo, we don't care about it.

After that, we have FreeBuild commented out by bwill. As far as I understand, FREEBUILD is referenced by makefile.def to set the debug output type and linked library types, and also by tens of other build scripts to check whether the current build is a free build. Since this is indeed a free build (x86fre), one would expect FreeBuild to be set. Meanwhile, the reason FreeBuild was set to a path rather than just 1 (all known references test FREEBUILD with !IFDEF) is unclear. This may well be another remnant of the past build process.

Next, NTDEBUG is undefined. NTDEBUG is a Build Utility macro that sets the debug information type. In this case, no debug information will be included. NTBBT supposedly enables lego-izable building. MACHINENAME is set to x86fre and CheckInNtverp is undefined. The purpose of CheckInNtverp is not completely clear, but it is speculated to be associated with the repository management system (slm). The version we have, however, does not reference this environment variable at all.

REBASELANG is set to usa, and this is referenced in ntrebase.cmd, which is used to rebase built executable files. _QFE_BUILD is also used in ntrebase.cmd to select the rebase procedure. If _QFE_BUILD is set, a rebase procedure that is different from the retail one is executed. The purpose and content of _BLDTOOLS are unclear.
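The chain from _NTDRIVE to _NT386TREE is just nested %NAME% expansion. As a rough illustration, this Python sketch replays the relevant set lines from setenv.cmd and x86fre.cmd; expand and run_sets are hypothetical helpers that cover only simple %NAME% references, not the full cmd expansion rules.

```python
import re

def expand(value, env):
    """Expand %NAME% references; names are case-insensitive, undefined ones vanish."""
    return re.sub(r"%(\w+)%", lambda m: env.get(m.group(1).upper(), ""), value)

def run_sets(assignments, env):
    """Apply `set NAME=value` lines in order, expanding each value when it is set."""
    for name, value in assignments:
        env[name.upper()] = expand(value, env)
    return env

env = run_sets(
    [
        ("BINDRIVE", "%_ntdrive%"),
        ("BINARIES", r"%BINDRIVE%\binaries"),
        ("CAIROBINS", r"%BINARIES%\cairo"),
        ("_NT386TREE", r"%BINARIES%\nt"),
        ("_CAIRO386TREE", r"%CAIROBINS%\nt"),
    ],
    {"_NTDRIVE": "W:"},
)
print(env["_NT386TREE"])  # W:\binaries\nt
```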

It seems that the x86fre.cmd was specifically modified after the original release for QFE (Quick Fix Engineering) build. I am not certain about the full QFE build environment, but it is not critical for our purposes. The QFE portion can be ignored.

The following is x86chk.cmd for comparison:

set _NT386TREE=%BINARIES%
set _CAIRO386TREE=%CAIROBINS%
set FreeBuild=\\X86Fre\Binaries
set FreeCBuild=\\X86Fre\CairoBin
set nt_up=0
set BKUPDRIVE=e:
set MACHINENAME=x86chk
set _TARGET=i386
set path=%path%;d:\slick;c:\bldtools\%_TARGET%

There is little difference between x86fre.cmd and x86chk.cmd. Some notable differences are: _NT386TREE points to %BINARIES% rather than %BINARIES%\nt, FreeBuild is set, nt_up is set to 0 (multi-processor target is enabled), BKUPDRIVE is set, MACHINENAME is set to x86chk, and PATH is extended differently.

NT 4: Razzle Build Environment Directory Tree Structure

In this article, I will explain how the NT 4 source tree is structured.

The root directory, nt, is located under the source drive root. For example, if the source drive letter is W:, the location of the nt directory would be W:\nt. The nt directory contains all files necessary to build the core operating system and various other components.

The nt directory contains two main directories: private and public. The private directory contains the source code required to build all components and the public directory contains the build scripts and SDK components necessary to complete the build.

We will first take a look at the public directory. The public directory contains three sub-directories: oak, sdk, tools.

oak: OEM Adaptation Kit
- bin: contains the build process definitions (makefile.def, *mk.inc)
- inc: contains all header files required for OEM adaptation of NT 4. This directory is not important for our purposes.

The highlight of the oak directory is the bin directory, which contains the build process definition files. As explained in the NT 4: Razzle Build Environment – Build Utility article, the makefile.def is used by the Build Utility, which is used for building the NT source tree.

sdk: Software Development Kit
- bin: SDK binaries, not used
- inc: SDK headers, is updated with the header files from the private directory during the build process. It is also referenced throughout the build process by almost all components.
- lib: SDK libraries, contains the library (.lib) files built during the build process. This directory is also referenced by various components for linking. In addition to the libraries, it contains coffbase.txt and placefil.txt.
- rpc16: 16-bit RPC Development Kit, is not used

The sdk directory is possibly the most important directory under public. It is updated and referenced throughout the build process and will contain all SDK headers and libraries by the end of the build. It also serves as an SDK for building high level components because all base system components are built prior to them.

The tools directory contains various build scripts necessary to maintain the source tree and set up the build environment. The most notable one would be ntenv.cmd, which is called by a number of different script files to initialise the Razzle build environment. In addition to that, it also contains the razzle.cmd, which is used to set a few additional parameters prior to calling ntenv.cmd.

The private directory contains the source code implementations of various components. The NT 4 source tree is organised in such a way that each directory represents a project or a component.

In general, all directories directly or one level under the private directory contain major system components, and sub-directories of the directories contain sub-components.

The following is the list of directories directly under the private directory:

private
- bldtools: Build Scripts
- crt32: C Runtime Library Source Code
- crtlib: C Runtime Library Build Scripts/Output Directory
- csr: Client-Server Runtime Subsystem (CSRSS)
- dcomidl: DCOM IDL Files
- developr: Developer Profile Directory (contains environment configurations)
- eventlog: Event Logging System
- exchange: Microsoft Exchange Mail Server source code (not available)
- fp32: C Runtime Library Floating-point Library Source Code
- inc: NT Global Headers
- inet: Inet/Internet Components
- lsa: Local Security Authority Subsystem (LSASS)
- mvdm: Multiple Virtual DOS Machine
- net: Net/Network Components
- newsam: Security Account Manager (SAM)
- newsam2: Security Account Manager (SAM) (unused)
- nls: Native Language Support Message Files
- nlsecutl: NLREPL SAM Database Replication Functions
- ntos: NT Core Operating System, including kernel and core system components
- nullsrv: Null Server (LPC benchmark)
- nw: NetWare Support Components
- ole2ui32: OLE 2.0 User Interface Support Library
- ole32: OLE Components
- oleauto: OLE Automation Components
- oletools: Unused
- oleutest: OLE Unit Tests
- os2: OS/2 Subsystem
- posix: POSIX Subsystem
- rpc: Remote Procedure Call (RPC) Components
- rpcutil: Common RPC Functions
- sam: NOT Security Account Manager (SAM), contains icfg32
- sdktools: SDK Tools Source Code
- sm: Session Manager
- tapi: Telephony API
- types: Common Types
- types2: Common Types
- unimodem: Unimodem, modem support system
- urtl: User Mode Runtime Library
- utils: System Utilities (Console Commands)
- wangview: Stuff from Wang Laboratories
- windbg: WinDbg
- windows: Windows Shell Components

Most of the directories listed above contain a number of sub-directories that hold sub-components.

NT 4: Razzle Build Environment Build Utility Reference

This article contains the full reference information for the NT 4 Build Utility (build.exe). For the blog article regarding the Build Utility, read NT 4: Razzle Build Environment Build Utility. The content of this article is based on that of the build.hlp file.

Environment Variables

BUILD_OPTIONS: Use this to specify the OPTIONAL_DIRS to process. This is equivalent to adding the components listed under the dirs file OPTIONAL_DIRS parameter as Build Utility arguments.

BUILD_DEFAULT: Use this to specify the default options for the Build Utility. For example, if BUILD_DEFAULT is set to -M 16, the Build Utility will automatically use the -M 16 option even if you don't manually supply it in the arguments.

BUILD_MAKE_PROGRAM: Use this to specify the make program to execute. The default is nmake.exe.

BUILD_DEFAULT_TARGETS: Use this to specify the default target platform for which you are building. Assuming cross compilers existed, you could set it to -MIPS on an X86 and build MIPS binaries.

BUILD_ALT_DIR: Use this to specify an alternate object directory name. It must be ten characters or less and contain no spaces. The value is added to the end of the .obj name and (if no -j switch is used for build.exe) the logfile name. For example, if BUILD_ALT_DIR is set to Debug, it would generate objDebug for the object output directory and buildDebug.log/wrn/err.
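As a quick illustration of the BUILD_ALT_DIR naming rule described above, here is a small Python sketch. build_output_names is a hypothetical helper that only reproduces the rule as stated in this reference; it is not how build.exe is implemented.

```python
def build_output_names(alt_dir=""):
    """Return the object directory and log file names for a BUILD_ALT_DIR value."""
    # The reference limits the value to ten characters and no spaces.
    if len(alt_dir) > 10 or " " in alt_dir:
        raise ValueError("BUILD_ALT_DIR must be at most ten characters with no spaces")
    return {
        "obj_dir": "obj" + alt_dir,
        # The log names get the suffix only when build.exe runs without -j.
        "logs": ["build" + alt_dir + ext for ext in (".log", ".wrn", ".err")],
    }

print(build_output_names("Debug"))
```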

Macro Definitions

INCLUDES: Use this macro in your sources file to indicate to the Build Utility where to find the headers that you are including in your build. Specify a list of the paths to be searched for include files during compilation. Separate the entries in this list with a semicolon. Path names can be absolute or relative.

SOURCES: The SOURCES macro is the most important macro for the Build Utility. You must have this macro in your sources file. The SOURCES macro specifies which files are going to be compiled. The Build Utility will look at these files and generate a dependency list. If any of those dependencies change, the Build Utility will rebuild this source file.

TARGETEXT: Use this macro to specify the extension name (such as .cpl) when you want the DLLs to have something other than .dll as the filename extension. If you specify something unexpected, you will see a message Unexpected Target Ext.

TARGETNAME: Use this macro to specify the name of the library being built, excluding the filename extension. You must have this macro in your sources file.

TARGETPATH: Use this macro to specify the target directory name that is the destination of all build products (such as .exe, .dll, and .lib files). Notice that object files always end up in the obj subdirectory. You must have this macro in your sources file.

TARGETTYPE: Use this macro to specify the type of product being built. This is typically LIBRARY or DYNLINK (for DLLs), but can take other values. TARGETTYPE gives the Build Utility some clues about the input files that it should expect. You must have this macro in your sources file. The valid values for TARGETTYPE include:

PROGLIB - An executable that exports something. It's a program library.
PROGRAM - A plain program file that does not export anything. It just imports stuff in the default .exe on the command line.
DYNLINK - A DLL, a control panel applet, or anything else that can be dynamically loaded or that people can import from. When it's linked, the DLL switch is passed to the linker to indicate that it's not a standalone .exe but something that is dynamically linked. When you build a dynamic link, you may also need to set the TARGETEXT macro.
LIBRARY - A component library. This is a library of objects, not an import library. (An import library is built as a side effect of building a dynamic link: any time you build a dynamic link, you get a .lib file and a .dll file. When you build a library, you just get a .lib file.)
DRIVER - A system kernel driver.
EXPORT_DRIVER - An export driver is like a driver except it exports things. It provides services to other drivers. There are two of those in Windows NT.
HAL - Hardware Abstraction Layer. This is the kernel HAL for Windows NT.
BOOTPGM - A kernel driver.
MINIPORT - A kernel driver.
GDI_DRIVER - A kernel driver that is similar to a DLL, which is loaded in user and kernel space.

UMAPPL: This macro enables you to build multiple targets from a single subdirectory where every target is a source file and some other .lib files that you link against. To use this macro, specify a list of source filenames containing a main function. Specify these filenames without filename extensions and separate them with an asterisk.
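The asterisk-separated UMAPPL list can be pictured with a short Python sketch. umappl_targets is a hypothetical helper and the names foo and bar are made up; the .exe default follows from the UMAPPLEXT description, which says UMAPPL is used when the extension should be .exe.

```python
def umappl_targets(umappl, ext=".exe"):
    """Split an asterisk-separated UMAPPL value into one image name per entry."""
    names = [name for name in umappl.split("*") if name]
    return [name + ext for name in names]

# Each listed source file (without extension) contains its own main function.
print(umappl_targets("foo*bar"))
```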

UMAPPLEXT: Use this macro to specify the extension name (for example .COM or .SCR) that will be appended when you want image files to have multiple filename extensions when you are building from a single source file. Use UMAPPLEXT when you want the extension to be something other than .exe. If you want the filename extension to be .exe, use UMAPPL.

UMLIBS: Use this macro to specify a list of library path names to be linked to the files specified in UMTEST or in UMAPPL. Include the library generated by the sources file. Separate the entries in this list with spaces or tabs.

BASEDIR: Use the BASEDIR macro when referring to the base of the source tree. By default, the source tree starts at $(_NTDRIVE)\nt, but it's not required. By using BASEDIR to refer to the base, you abstract out this dependency.

BINPLACE_FLAGS: Use this macro to specify arguments that you want to pass to binplace.exe.

BINPLACE_PLACEFILE: Use this macro to specify the placefile used by binplace. If nothing is listed, $(BASEDIR)\public\sdk\lib\placefil.txt is used by default.

C_DEFINES: Use this macro to specify switches you want passed to the compiler. Typically, they are compiler #defines.

COFFBASE: Use this macro to specify the name to look up in COFFBASE_TXT_FILE. If you do not specify a name, it defaults to the value of TARGETNAME.

COFFBASE_TXT_FILE: The name of the file passed to the linker with the base addresses for the images you build. By default, this is $(BASEDIR)\public\sdk\lib\coffbase.txt.
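To show how COFFBASE and COFFBASE_TXT_FILE fit together, here is a rough Python sketch of the lookup. It assumes the base-address file holds whitespace-separated "name base size" lines with ";" starting a comment; that layout, the lookup_base helper, and the sample values are all assumptions made for illustration, so check them against the real coffbase.txt.

```python
def lookup_base(coffbase_text, targetname, coffbase=None):
    """Return the base address for COFFBASE (or TARGETNAME if COFFBASE is unset)."""
    key = coffbase or targetname          # COFFBASE defaults to TARGETNAME
    for raw in coffbase_text.splitlines():
        line = raw.split(";", 1)[0].strip()   # assumed ';' comment syntax
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == key:
            return int(fields[1], 16)
    return None

# Illustrative content only, not copied from the real file.
sample = """
; name      base        size
kernel32    0x77f00000  0x00060000
usermode    0x60000000  0x20000000
"""

print(hex(lookup_base(sample, "kernel32")))
print(hex(lookup_base(sample, "myapp", coffbase="usermode")))
```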

COMPILER_WARNINGS: The name of the warning file passed to the compiler with the /FI switch. By default, this is $(BASEDIR)\public\sdk\inc\warning.h. The file contains a list of compiler pragmas used to disable, enable, or promote warnings for the entire build.

CRT_INC_PATH: Specifies the path to the C Runtime headers. The default is $(BASEDIR)\public\sdk\inc\crt

CRT_LIB_PATH: Use this macro to specify the path to the C Runtime libraries. The default is $(BASEDIR)\public\sdk\lib\*

DEBUG_CRTS: Whether you build checked or free system, the Build Utility always links against the retail runtime libraries and the retail MFC. If you want to link against the debug MFC and the debug runtime libraries, set this macro to 1.

DLLBASE: You only need to use this macro when your TARGETTYPE macro is set to DYNLINK. Use it to set the base address for the DLL image you are creating. If you do not specify an address, the Build Utility will assume that the target name in coffbase.txt is the name of your image. You can override this default target name by specifying a target name with the DLLBASE macro. You can set DLLBASE to be the hard-coded base address, a hex address, or you can leave it blank. If you leave it blank, the Build Utility will always look up the target name specified in coffbase.txt.

DLLDEF: Use this macro to specify the name of the .def file that the Build Utility will pass to the librarian when building the export and import files. If you do not set this, the Build Utility will assume it is the same name as the image you are building.

DLLENTRY: Use this macro to specify the DLL entry point. By default, no entry point is assumed. For example, when you bring over programs that were built in the VC build environment or use the C Run-time, you will probably set _DllMainCRTStartup.

DLLLIBOBJECTS: Use this macro to specify extra objects to add to an import library.

DLLORDER: When you are building a DLL, you can specify an order file that will be passed to the linker. The order file lists the functions and the order in which they should be linked together. By default, the Build Utility passes the name of the DLL as the name of the order file. For example, if you are building kernel32.dll, the Build Utility expects kernel32.prf as the order file.
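
For illustration only, the DLL-related macros above could be combined in a sources file along the following lines. The target name mydll, the .def file name and the base address are hypothetical placeholders; only the macro names and the _DllMainCRTStartup entry point come from the descriptions above.

```
TARGETNAME=mydll
TARGETPATH=obj
TARGETTYPE=DYNLINK

DLLDEF=mydll.def
DLLENTRY=_DllMainCRTStartup
DLLBASE=0x60000000
```

Leaving DLLBASE out would, per the description above, make the Build Utility look up the target name in coffbase.txt instead of using a hard-coded address.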

DRIVERBASE: Similar to DLLBASE and UMBASE, use this macro to specify the base address for a driver. It’s generally not necessary to set this because the driver will be relocated at run time anyway.

EXEPROFILEINPUT: This macro has been changed and is now exactly the same as NTPROFILEINPUT.

FREEBUILD: Use this macro to specify whether your build is checked (debug) or free (retail). You can say, “If $FREEBUILD, do things for a retail build.” Maybe you want to set a different set of flags or put it in a different place or compile a different way from a checked (or debug) build.

GPSIZE: Specify a value for this macro in your sources file to control the GPSIZE switch to the linker. The GPSIZE is an optimization used on RISC platforms (on MIPS and PowerPC only). The value used for Windows NT is 32.

HALBASE: Similar to DLLBASE and UMBASE, use this macro to specify the base address for a HAL. It’s generally not necessary to set this because the HAL will be relocated at run time anyway.

IDL_RULES: This macro is only used for Cairo and OLE builds.

IDL_TYPE: When you specify an IDL file in your sources rule, you have to specify whether this is an OLE IDL or an RPC IDL because their syntax differs. Based on the syntax, the Build Utility passes different commands to MIDL. The default for IDL_TYPE is OLE.

LANGUAGE: Use this macro to specify the language when you set up dependencies so that you can include country-specific parts in your build. The default is LANGUAGE=USA. This macro is not used often.

LINKER_FLAGS: Use the LINKER_FLAGS macro to override any default linker switch that you want to pass to the linker.

LINKER_NOREF: Use this macro to turn off switches to the linker. The Build Utility turns some switches on, by default. One of them is the OPTICAL_AND_REF switch, which says, “Throw out everything that’s not referenced in this module.” This is the right thing to do if you want small modules, but if you’ve got some debug routines in there that you want to call from the debugger, it’s kind of annoying to have them all thrown away on you. To avoid this, set LINKER_NOREF in your environment, rebuild your product, load it up in the debugger, and then run these functions that are only used in the debug scenario.

LINKLIBS: Use the LINKLIBS macro to specify libraries that you need to link against. LINKLIBS enables you to specify a macro called PERFLIBS, which are extra libraries you use for a performance case. The only difference between TARGETLIBS and LINKLIBS is the ordering on the command line. LINKLIBS usually gets passed first; TARGETLIBS gets passed second.

MAJORCOMP: Use this major component macro to specify the first part of a filename you are building for use by the ALPHA and MIPS compiler.

MAKEDLL: The build process is a two-pass build. In the first pass, the Build Utility compiles all the source files, and creates import libraries and component libraries. In the second pass, it links everything against those libraries. Setting this macro to 1 will force NMAKE to process the second pass.

MASTER_VERSION_FILE: The Master Version File for Windows NT is called ntverp.h and it’s located in \public\sdk\inc. You can either use ntverp.h or use the MASTER_VERSION_FILE macro to specify a different master version file.

MFC_FLAGS: If you have extra command line options you want to pass to the compiler, you can put them in MFC_FLAGS. They will only affect programs that use MFC.

MFC_INC_PATH: Use this macro to specify the path to the MFC headers. The default is $(BASEDIR)\public\sdk\inc\mfc$(MFC_VER)

MFC_INCLUDES: If you have your own MFC include files, use this macro to specify where you put your MFC headers on your system.

MFC_LIB_PATH: Use this macro to specify the path to the MFC libraries. The default is $(BASEDIR)\public\sdk\lib

MFC_LIBS: Use this macro to provide explicit MFC library names and override the default names that the Build Utility uses.

MFC_VER: Use this macro to specify the version of MFC to build with. By default, it is set to 40. Valid values must be 40 or greater. To use MFC 3.x, define USE_MFC30.

MIDL_OPTIMIZATION: Use this macro to override the default optimization that is passed to the MIDL compiler. The default is OI2.

MIDL_UUIDDIR: Use this macro to specify where the GUID file goes when you generate an OLE IDL file (UUIDs and GUIDs are the same thing). By default everything built in pass zero goes to wherever you set the TARGETPATH subdirectory. Use this macro to override that.

MINORCOMP: Use this macro to specify the second part of the filename constructed for use by the MIPS compiler. This is required only for MIPS and ALPHA builds.

MISCFILES: Use the MISCFILES macro to list items that you want to put into the appropriate installation point when the Build Utility runs binplace.exe.

MSC_OPTIMIZATION: Use this macro to override the default optimization the Build Utility uses on the compiler. By default, everything is optimized. If you want to turn off optimization to step through your code, you can set MSC_OPTIMIZATION to whatever is appropriate for your compiler. Platform-specific optimization flags may be defined as follows (if defined, it will override MSC_OPTIMIZATION):

ALPHA_OPTIMIZATION
386_OPTIMIZATION
MIPS_OPTIMIZATION
PPC_OPTIMIZATION

MSC_WARNING_LEVEL: Use this macro to set the warning level to use on the compiler. The default is W3. After you have your code building without errors, you probably want to change MSC_WARNING_LEVEL to /W3 /WX, which treats every warning as an error. Platform-specific warning level flags may be defined as follows (if defined, it will override MSC_WARNING_LEVEL):

ALPHA_WARNING_LEVEL
I386_WARNING_LEVEL (possibly a typo for 386_WARNING_LEVEL; compare 386_OPTIMIZATION above)
MIPS_WARNING_LEVEL
PPC_WARNING_LEVEL

NOLINK: Similar to MAKEDLL, NOLINK forces NMAKE to process the first pass. You can say, if NOLINK=1, then you know you are in pass one; if MAKEDLL=1, then you know you are in pass two because in the first pass you do not want to link and in the second pass you just want to make the DLL.

NOMFCPDB: By default, whenever you are building an MFC program, the Build Utility generates the symbolic debugging information in a PDB file (a program database). MFC has so much stuff, the Build Utility puts it in the PDB. If you define NOMFCPDB, it won’t.

NO_NTDLL: Use this macro to indicate that NTDLL.LIB should not be automatically added to the library list.

NT_INST: This macro is used internally by the Windows NT build group to specify instrumentation. Some components for Windows NT use it to enable some special instrumentation when building.

NT_UP: Use this macro to indicate whether your driver will run on a uniprocessor machine or multiprocessor machine. The default is uniprocessor (NT_UP=1).

NTDBGFILES: Use this macro to control whether symbols should be stripped from final image files when the Binplace Utility is run by the Build Utility.

NTDEBUG: Use this macro to specify what type of symbolic information you want when building (and therefore which debugger you’ll be using). It is rarely used in the sources file; instead, it should be set in the environment before building the project.

NTDEBUGTYPE: Use this macro to specify the method of linking, which symbols are used.

ntsd For debugging with NTSD
windbg CodeView debug format
coff
both enables both ntsd and windbg

NTTARGETFILE0: You can define NTTARGETFILE0 to include .\makefile.inc immediately after it specifies the top level targets (all, clean and loc) and their dependencies. The makefile.def file expands NTTARGETFILE0 as the first dependent for the all target. The fact that NTTARGETFILE0 exists, even if it is defined to nothing, means that the Build Utility should open makefile.inc in the same subdirectory as the sources file. If you set NTTARGETFILE0, what you are saying is, “Not only should you include makefile.inc, but you should also build the thing that the macro defines.”

NTTARGETFILE1: NTTARGETFILE1 is exactly like NTTARGETFILE0, except that it happens later in the build process. NTTARGETFILE0 happens on pass zero; NTTARGETFILE1 happens on pass two. Both of these macros cause the Build Utility to change directories to a specified subdirectory and run nmake on pass zero or pass two where it might not ordinarily do that.

NTTARGETFILES: If you have unique rules in your subdirectory, you can set up this  macro to key the build process to say that along with your sources file in the subdirectory, you also have a makefile.inc, which can include extra dependencies, extra command line rules, anything you want to build.

O: Use this macro to specify the final objects subdirectory. Define this macro in your sources file or in makefile.inc to be certain something goes into the object subdirectory. The benefit of using $(O) is that any files that you have built and placed in the objects subdirectory will be deleted on the next clean build. This guarantees that no collisions will occur between two builds running on the same machine at the same time. They will never overwrite each other’s files if you follow the convention that everything you build goes in $(O).

OAK_INC_PATH: Use this macro to specify the path to the OEM Adapter Kit headers. The default is $(BASEDIR)\public\oak\inc

PASS0_SERVERDIR
PASS0_CLIENTDIR
PASS0_HEADERDIR
PASS0_SOURCEDIR:
Use the PASS0_CLIENTDIR, PASS0_HEADERDIR, PASS0_SERVERDIR, and PASS0_SOURCEDIR macros to specify where to put the output from MC and MIDL. When you have IDLs in your SOURCES macro and run MIDL, you generate a server part, a client part, a header, and a default source. Use PASS0_SERVERDIR and PASS0_CLIENTDIR to specify where the server and client parts go. When you run MC, you create a header and a source file. Use PASS0_HEADERDIR and PASS0_SOURCEDIR to specify where that output goes.

PNP_POWER: This macro is used internally by the Windows NT build group to specify plug-and-play power definition. This enables you to build a driver that understands plug and play.

PRECOMPILED_CXX: Use this macro to indicate whether the precompiled header you are building will be used with C files or with C++ files. The default is to use precompiled headers with C. Therefore, to use the precompiled header with C, do not set PRECOMPILED_CXX at all.

PRECOMPILED_INCLUDE: Use this macro to specify the name of the precompiled header. If you omit this, you will not be able to have precompiled headers. PRECOMPILED_INCLUDE is what triggers the build process to understand that you do have precompiled headers.

PRECOMPILED_OBJ
PRECOMPILED_TARGET:
By default, the Build Utility takes the precompiled header with the precompiled #include setting that you specify, for example precomp.h. Out of that, it creates precomp.obj for the precompiled object and precomp.pch for the precompiled target. You can override those names by setting the PRECOMPILED_OBJ and PRECOMPILED_TARGET macros.

RC_COMPILER: Use this macro to override the RC compiler to be used (e.g. in different code pages for building Asian language builds).

SDK_INC_PATH: Use this macro to specify the path to the SDK headers. The default is $(BASEDIR)\public\sdk\inc

SDK_LIB_PATH: Use this macro to specify the path to the SDK libraries. The default is $(BASEDIR)\public\sdk\lib\*

SOURCES_USED: Use this macro to indicate that another sources file or makefile exists elsewhere in the tree when that file has things in it that your build is dependent upon.

SUBSYSTEM_VERSION: If your product has to run on Windows NT 3.1, 3.5, or 3.51, set your subsystem version to that version number.

TARGET_CPP: Use this macro to specify the name of the compiler.

TARGET_DIRECTORY: Use this macro to specify the target directory when you want some dependency file to always end up in the obj subdirectory.

TARGETLIBS: Use TARGETLIBS to specify other libraries that you want to link against when building your image. It should be your primary method for specifying libraries or objects you want to link against to build your image.

TARGETPATHLIB: Use the TARGETPATHLIB macro to specify where to put the import library when you are building a DLL.

UMBASE: You only need to use this macro when you are building a dynamic link library (a DLL). Use it to set the base address for the DLL image you are creating. If you do not specify an address, the Build Utility will assume that the target name in coffbase.txt is the name of your image. You can override this default target name by specifying a target name with the DLLBASE macro. You can set DLLBASE to be the hard-coded base address, a hex address, or you can leave it blank. If you leave it blank, the Build Utility will always look up the target name specified in coffbase.txt.

UMENTRY: Use this macro to override the default entry point (mainCRTStartup) and specify the entry point depending on the UM Type. You can set this name to be anything you choose. If the UM Type is Windows or Console, the default entry point is main and you can override it with winmain, wmain, or wwinmain.

UMENTRYABS: Use this macro to specify an absolute entry point. For example, you might specify UMENTRY=main, but the real entry point is mainCRTStartup. If you do not want mainCRTStartup to be the entry point, specify UMENTRYABS to make main the absolute entry point. This prevents the Build Utility from going through the translation table that says if it’s main, make it mainCRTStartup.

UMTEST: Use this macro to list source filenames containing a main function. Type these filenames without filename extensions and separate them with an asterisk.

UMTYPE: Use this macro to specify the type of product being built.

windows Win32 Program
nt Native Kernel-mode System Windows NT Program
ntss Windows NT Subsystem
os2 OS/2 Program
posix POSIX Program
console Win32 Console Program

USE_(RUNTIMELIB): Depending on how you want to link your image, you can link runtime libraries from the DLL for Windows NT.

USE_CRTDLL Multi-threaded runtime in a DLL
USE_MSVCRT Multi-threaded runtime in a DLL
USE_LIBCMT Multi-threaded static
USE_LIBCNTPR Kernel
USE_NTDLL The DLL for Windows NT
USE_NOLIBS None

USE_INCREMENTAL_LINKING: Use this macro to direct the Build Utility to use incremental linking.

USE_MFCUNICODE: Use this macro in your sources file to indicate you are using Unicode MFC. This will establish the correct build environment for a program that needs to use Unicode MFC. The Build Utility supports either MFC3 or MFC4.

USE_MFC: Use this macro in your sources file to indicate you are using MFC.

USE_MFC30: Use this macro to direct the Build Utility to use MFC version 3.0 headers and libraries when building. You must list this with USE_MFC or USE_MFCUNICODE to be effective.

USE_NATIVE_EH: If you are using Try Catch and Throw, the standard C++ exception handling (C++EH), you must set this macro to 1.

USE_PDB: If you want a VC4 PDB for your debug symbolic files, set this macro to 1.

USE_STATIC_MFC: The Build Utility always uses MFC in a DLL, but there are a number of cases where that’s not available. In those cases, specify USE_STATIC_MFC in your sources file. Then the Build Utility will link MFC into your program statically instead of dynamically loading it from a DLL. Your program will get bigger, but you can do more things.

USECXX_FLAG: This macro enables you to go to a subdirectory that has all C files and compile them with the C++ compiler rather than the C compiler. One reason for doing that might be switching to C++. Rather than change all your filenames to be a.cpp, b.cpp, which is a lot of work for no real gain, you can just specify USECXX_FLAG=/Tp

USER_C_FLAGS: Use this macro to specify flags that only go to the C/C++ compiler. Unlike C_DEFINES, USER_C_FLAGS doesn’t go to the RC compiler.

USER_INCLUDES: You will usually use the INCLUDES macro in your sources file to indicate to the Build Utility where to find the headers that you are including in your build. But there are times when some header files may not exist. Maybe they are built as part of the build process. Specify those header files in USER_INCLUDES to notify the Build Utility, “Here’s another place to go, but do not worry if you cannot find these header files because there may not be anything there.”

NT 4: Razzle Build Environment Build Utility

Razzle is the build environment internally used by Microsoft. It uses the Microsoft Build Utility (build.exe) to build all components of the system.

The Build Utility (build.exe) was created by Steve Wood at Microsoft to establish a unified build environment for the NT Group and other associated projects. Prior to the Build Utility, the build process was based solely on makefiles, which resulted in significant variation in the build process across projects and made build issues hard to troubleshoot.

The Build Utility solves this problem by defining the exact project build definition structure (dirs and sources file, explained later in this article). Aside from the build process unification, the Build Utility also offers the following benefits over the conventional Makefile:

  • Automatic Dependency Computation: the dependencies of each source file and different projects are automatically computed and the build process proceeds accordingly
  • Multithreaded Build: the Build Utility is able to build multiple projects at once based on the computed dependency
  • Architecture Independent Project Specification: Unless processor architecture-specific features are present, all projects can be built on any architecture without any changes to the build definition

The following is the help output of the Build Utility:

>build -?
BUILD: Using 10 child processes

BUILD: Version 4.02.1381

Usage: BUILD [-?] display this message
        [-b] displays full error message text (doesn't truncate)
        [-c] deletes all object files
        [-C] deletes all .lib files only
        [-e] generates build.log, build.wrn & build.err files
        [-E] always keep the log/wrn/err files (use with -z)
        [-f] force rescan of all source and include files
        [-F] when displaying errors/warnings to stdout, print the full path
        [-i] ignore extraneous compiler warning messages
        [-k] keep (don't delete) out-of-date targets
        [-l] link only, no compiles
        [-L] compile only, no link phase
        [-m] run build in the idle priority class
        [-M [n]] Multiprocessor build (for MP machines)
        [-o] display out-of-date files
        [-O] generate obj\_objects.mac file for current directory
        [-p] pause' before compile and link phases
        [-P] Print elapsed time after every directory
        [-q] query only, don't run NMAKE
        [-r dirPath] restarts clean build at specified directory path
        [-s] display status line at top of display
        [-S] display status line with include file line counts
        [-t] display the first level of the dependency tree
        [-T] display the complete dependency tree
        [-$] display the complete dependency tree hierarchically
        [-u] display unused BUILD_OPTIONS
        [-v] enable include file version checking
        [-w] show warnings on screen
        [-y] show files scanned
        [-z] no dependency checking or scanning of source files -
                one pass compile/link
        [-Z] no dependency checking or scanning of source files -
                two passes
        [-why] list reasons for building targets

        [-all] same as -386, -mips, -alpha and -ppc
        [-alpha] build targets for alpha
        [-mips] build targets for mips
        [-386] build targets for i386
        [-ppc] build targets for PowerPC

        [-x filename] exclude include file from dependency checks
        [-j filename] use 'filename' as the name for log files
        [-nmake arg] argument to pass to NMAKE
        [-clean] equivalent to '-nmake clean'
        Non-switch parameters specify additional source directories
BUILD: Done

To initiate a build, the Build Utility is launched in the target project directory. The utility then searches for the build process definition files: the dirs and sources files. Once the utility reaches a sources file, it performs a dependency analysis and creates an _objects.mac file under the output directory, which lists the object file dependencies for the component build. After this, the utility invokes the make utility (NMAKE by default), which performs the operations defined in the OAK (OEM Adaptation Kit) makefile.def.

The dirs file defines the list of sub-directories to be built. The following is an example of dirs file:

DIRS=up

OPTIONAL_DIRS=mp

There are two parameters defined in the dirs file: DIRS and OPTIONAL_DIRS. The DIRS parameter lists all sub-directories to be built when the Build Utility is invoked and the OPTIONAL_DIRS parameter lists the sub-directories to be built only when specifically instructed to do so. In order to build the OPTIONAL_DIRS, the target optional directory name must be added to the argument of the Build Utility. Each sub-directory may contain an additional dirs file if it is a project root directory, or sources file if it is a project/component directory.

The sources file defines the details about the files to be built in the directory. The following is an example of the sources file:

MAJORCOMP=windows
MINORCOMP=cmd

TARGETNAME=cmd
TARGETPATH=obj
TARGETTYPE=PROGRAM

INCLUDES=..;..\..\inc

SOURCES=..\cmd.rc \
        ..\cmd.c

UMTYPE=console
UMLIBS=$(BASEDIR)\public\sdk\lib\*\user32.lib       \
       $(BASEDIR)\public\sdk\lib\*\shell32.lib      \
       $(BASEDIR)\public\sdk\lib\*\advapi32.lib

The MAJORCOMP and MINORCOMP parameters specify the full file name to be used during build. The file name is constructed by concatenating the two parameters. This feature is only used by the ALPHA and MIPS build processes and is not important for x86 builds.

The TARGETNAME and TARGETPATH parameters specify the name and the path of the target file to be built, respectively. The TARGETTYPE parameter specifies the extension name of the target file as well as the build process to be used. In this case, the TARGETTYPE of PROGRAM will produce an executable (exe) file.

The INCLUDES parameter lists all header include paths used by the component. This parameter is passed on to the C compiler and is also used internally by the Build Utility for computing dependencies.

The SOURCES parameter lists all source files to be built. This is not restricted to C source only, and may include resource (.rc) and assembly (.asm, .s) source files as well.

The UMTYPE parameter specifies the type of component to be built. In this case, the type is console: Win32 Console program. The UMLIBS parameter lists all files to be linked. This includes dynamic and static library (.lib) and compiled resource (.res) files.

Refer to the NT 4: Razzle Build Environment – Build Utility Reference article for the full environment variable and macro definitions.

System/14: Breadboard Prototype Address Decoding Logic

In this article, I will briefly discuss how the address decoding logic of my prototype breadboard implementation works. The purpose of this article is to provide you with a general idea of how the address decoding logic is designed and implemented in an 8086 system. This article does not represent the full implementation of system address decoding logic compatible with the IBM PC/XT.

The 8086 processor has two separate address spaces: memory and I/O. The address decoding logic must decode the addresses for each address space individually.

The following is the memory map of the breadboard prototype:
Memory Address Space (Linear)   |  I/O Address Space
00000-EFFFF   Unassigned        |  0000-005F   Unassigned
F0000-FFFFF   Firmware ROM      |  0060-006F   8255 Programmable Peripheral Interface
-                               |  0070-FFFF   Unassigned
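
To make the map concrete, the following is a minimal C sketch that classifies an address according to the ranges in the table. The type and function names are made up for illustration, and the sketch only models the map itself, not any real hardware behaviour.

```c
#include <stdint.h>

/* Regions of the breadboard prototype address map (illustrative names). */
typedef enum { MEM_UNASSIGNED, MEM_FIRMWARE_ROM } mem_region_t;
typedef enum { IO_UNASSIGNED, IO_PPI_8255 } io_region_t;

/* Classify a 20-bit physical memory address. */
mem_region_t classify_mem(uint32_t addr)
{
    addr &= 0xFFFFFu;  /* the 8086 drives only A[19:0] */
    return (addr >= 0xF0000u) ? MEM_FIRMWARE_ROM : MEM_UNASSIGNED;
}

/* Classify a 16-bit I/O port address. */
io_region_t classify_io(uint16_t port)
{
    return (port >= 0x0060u && port <= 0x006Fu) ? IO_PPI_8255 : IO_UNASSIGNED;
}
```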

AT29C256 Pinout

The Firmware ROM device used in our breadboard prototype is the AT29C256, which is the direct pin-compatible E2PROM version of the 27C256 EPROM. The AT29C256 is a 32K x 8-bit (256Kb) device, whereas the data bus of the 8086 is 16 bits wide and we are planning to fill the 64KiB address space allocated to the firmware ROM. For this reason, two AT29C256 devices connected as a bank will be used to implement the firmware ROM space.

The 8086 performs all even address accesses on D[7:0] and all odd address accesses on D[15:8], regardless of the size of the data being accessed. For this reason, one ROM is directly connected to D[7:0] to handle all even addresses while the other ROM is directly connected to D[15:8] to handle all odd addresses. The ROM A[14:0] address pins are connected to A[15:1], as the lowest address bit is not applicable in the even/odd ROM bank configuration.
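
The even/odd bank arrangement can be sketched in C as a mapping from a byte address to a ROM chip and an offset within that chip. This is only a model of the wiring described above; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Where a given byte lives in the two-ROM bank (illustrative type). */
typedef struct {
    int      odd_rom;   /* 0: even ROM on D[7:0], 1: odd ROM on D[15:8] */
    uint16_t rom_addr;  /* value presented on the ROM's A[14:0] pins */
} rom_location_t;

rom_location_t locate_rom_byte(uint32_t addr)
{
    rom_location_t loc;
    loc.odd_rom  = (int)(addr & 1u);                   /* A0 picks the byte lane */
    loc.rom_addr = (uint16_t)((addr >> 1) & 0x7FFFu);  /* A[15:1] -> ROM A[14:0] */
    return loc;
}
```

For example, under this model the byte at F0003 is found at offset 1 of the odd ROM, and the last byte of the address space (FFFFF) is at offset 7FFF of the odd ROM.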

Breadboard ROM Connection

The \CE signals of the two ROMs are connected to the memory address decoding logic to enable appropriate ROM or ROMs based on the type of bus cycle. The \CE signal of each ROM is derived from the \BHE and A0 signals as well as the entire address bus, which is used to determine if the address being accessed falls within the ROM address range.

The 8086 processor uses the following combinations of A0 and \BHE to indicate the type and size of data being accessed on the bus:
| A0 \BHE  Size  Description              |
|  0   0    16   Whole word on D[15:0]    |
|  0   1    8    Even address on D[7:0]   |
|  1   0    8    Odd address on D[15:8]   |
|  1   1    -    None                     |

According to the table above, ROM 1 will be enabled when \BHE is 1 (inactive) and A0 is 0, and ROM 2 will be enabled when \BHE is 0 (active) and A0 is 1. Both ROM 1 and ROM 2 will be enabled when \BHE is 0 (active) and A0 is 0.
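
The enable conditions above reduce to one rule per ROM, which the following C sketch expresses directly. The function and type names are illustrative, and the address range check is deliberately left out here.

```c
#include <stdbool.h>

/* Which ROMs take part in the current bus cycle (illustrative type). */
typedef struct {
    bool rom1;  /* even ROM on D[7:0] */
    bool rom2;  /* odd ROM on D[15:8] */
} rom_enable_t;

/* a0 and bhe_n are the logic levels on the pins; bhe_n is active low. */
rom_enable_t rom_enables(bool a0, bool bhe_n)
{
    rom_enable_t en;
    en.rom1 = !a0;     /* low byte lane is in use whenever A0 is 0 */
    en.rom2 = !bhe_n;  /* high byte lane is in use whenever \BHE is 0 */
    return en;
}
```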

The \MRDC signal, indicating a memory read cycle, from the 8288 bus controller is directly connected to all memory devices on the bus so that they can carry out a read cycle when their \CE is active.

 

Breadboard Memory Address Decoding Logic

The schematic on the left shows the implementation of the memory address decoding logic for the firmware ROMs with 7400 series logic gates. The address range detection logic can be implemented in an extremely simple manner with just one quad-input NAND gate.

Since the firmware ROM is allocated to F0000-FFFFF range, the highest four bits of the address bus (A[19:16]) will be 1 if, and only if, the access address is 0xFxxxx. In turn, the NAND gate will output 0 only if all four A[19:16] bits are 1.

This address range detection signal is fed into NAND gates through inverters (note that I put two separate inverters to ensure signal integrity, although one inverter should have been able to drive two NAND inputs at the same time). The two NAND gates, which produce the Chip Select signal for each ROM, are fed with the address range detection output from the earlier NAND stage together with A0 and \BHE, respectively, to determine which ROM or ROMs should be selected for the current bus cycle.

As a quick walk-through, consider an even address access (A0 = 0, \BHE = 1) while the address range detection signal (I will call this signal ADDRRANGE from now on) is active (quad-NAND output is 0). The SROM1_CS NAND gate is fed with 1 (inverted A0) and 1 (inverted quad-NAND output) and therefore outputs 0. Note that SROM1_CS is in fact \SROM1_CS (active low), so this activates ROM 1, which is intended for even addresses. Meanwhile, the SROM2_CS NAND gate stays inactive because \BHE is 1 (inactive) at this time. The same pattern applies, with the roles swapped, to an odd address access. Similarly, when the bus is accessing a full 16-bit word (A0 = 0, \BHE = 0), both \SROM1_CS and \SROM2_CS are activated as both A0 and \BHE are 0.
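
The walk-through can be checked with a small gate-level model in C. This is a sketch under assumptions: the exact placement of the inverters on A0 and \BHE is inferred from the description rather than read off the schematic, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Basic gate models. */
static bool nand2(bool a, bool b) { return !(a && b); }
static bool nand4(bool a, bool b, bool c, bool d) { return !(a && b && c && d); }

/* Active-low chip selects for the two ROMs (0 = selected). */
typedef struct {
    bool srom1_cs_n;
    bool srom2_cs_n;
} rom_cs_t;

rom_cs_t decode_rom_cs(uint32_t addr, bool bhe_n)
{
    bool a0 = (addr & 1u) != 0;

    /* Quad-input NAND on A[19:16]: outputs 0 only for addresses Fxxxx. */
    bool range_n = nand4((addr >> 19) & 1u, (addr >> 18) & 1u,
                         (addr >> 17) & 1u, (addr >> 16) & 1u);

    /* Inverter: 1 when the address falls in the ROM range. */
    bool in_range = !range_n;

    rom_cs_t cs;
    cs.srom1_cs_n = nand2(!a0, in_range);     /* assumed: inverted A0 */
    cs.srom2_cs_n = nand2(!bhe_n, in_range);  /* assumed: inverted \BHE */
    return cs;
}
```

With this model, an even byte read at F0000 (\BHE = 1) drives only \SROM1_CS low, a word read at the same address (\BHE = 0) drives both low, and any access outside Fxxxx leaves both high.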

Moving on to the I/O address space decoding logic for the 8255 programmable peripheral interface, the idea is the same, with a few differences. The biggest difference is that the 8255 PPI is an 8-bit device and cannot be banked to fit the 16-bit bus of the 8086. Since the 8086 processor carries out all even address operations on D[7:0] and odd address operations on D[15:8] regardless of the access size (that is, the 8086 still uses D[15:8] instead of D[7:0] on odd addresses even if you are performing an 8-bit I/O), accesses to the 8255 have to be performed in such a manner that all its register addresses are aligned to either even or odd addresses. For example, if the 8255 is connected to D[7:0] and its base address is 060, all its register accesses will be on 060, 062, 064 and so on. Similarly, if the 8255 is connected to D[15:8] and its base address is 060, its register accesses will be on 061, 063, 065 and so on.
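
The resulting register spacing can be expressed as a one-line address calculation. The sketch below is illustrative only: the function name is hypothetical, and reg is simply the index of the 8255 register being accessed.

```c
#include <stdint.h>

/*
 * I/O port for register `reg` of an 8-bit device on a 16-bit 8086 bus.
 * odd_lane = 0: device wired to D[7:0]  (even addresses only)
 * odd_lane = 1: device wired to D[15:8] (odd addresses only)
 */
uint16_t ppi_port(uint16_t base, unsigned reg, int odd_lane)
{
    return (uint16_t)(base + 2u * reg + (odd_lane ? 1u : 0u));
}
```

With a base of 060, registers 0, 1 and 2 land on 060, 062 and 064 on the even lane, and on 061, 063 and 065 on the odd lane.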

This is not a major issue if it is not necessary to linearise the I/O address space. It is, in fact, possible to assign one device to all even addresses and another device to all odd addresses on the same base (one device is connected to D[7:0] and the other to D[15:8], while the Chip Select logic handles A0 and \BHE accordingly). However, this is an issue if it is necessary to replicate a pre-defined address space where devices are assigned linearly (e.g. implementing the PC/XT address space; note that the XT used the 8088, which had an 8-bit data bus, so this was not an issue there). In order to linearise the address space for an 8-bit device, an additional 8286 transceiver that connects the processor local bus AD[15:8] to the system bus D[7:0] during a linear 8-bit access is required. The details of this implementation are not relevant at this phase and will be discussed in future articles.

Breadboard IO Address Decoding Logic

Because of the address range requirement for the 8255 PPI (0060-006F), the I/O address decoding logic is somewhat more complex than the memory address decoding logic. In short, the section to the left of the 74LS138 decoder ensures that A[15:7] are all zero (so that addresses of the form 0xXX00-0xXX7F with non-zero upper bits do not enable the 138 decoder), and the 138 decoder decodes A[6:4] into individual Chip Select signals for the devices within the I/O address range 0000-007F, one for each 16-byte range.
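
The behaviour of this I/O decoder can be approximated in C as follows. This is a simplified sketch: it models only the A[15:7] enable condition and the A[6:4] selection, ignores A0, \BHE and the I/O read/write command signals, and uses a hypothetical function name.

```c
#include <stdint.h>

/*
 * Returns the index (0..7) of the 74LS138 output selected for an I/O
 * port address, or -1 if the decoder is not enabled at all.
 */
int io_decode_138(uint16_t port)
{
    if ((port >> 7) != 0)      /* A[15:7] must all be zero */
        return -1;
    return (port >> 4) & 0x7;  /* A[6:4] picks one 16-byte range */
}
```

Under this model, the 8255 range 0060-006F corresponds to decoder output 6, while an address such as 0160 does not enable the decoder.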

This was a quick overview on how the basic 8086 address decoding logic works. The future implementations may use PAL or PLD for this purpose to reduce the chip count and thereby the board space.

System/14: 8086 Central Processing Unit

The 8086 central processing unit is the core of the System/14 project. It is possibly one of the most influential processors ever to be commercialised as it became the basis for all modern PCs.

Intel 8086 Pinout

The 8086 features a 20-bit address bus and a 16-bit data bus. Due to the limited number of pins available on standard DIP packages, the 8086 uses a time-multiplexed bus technique that allows the same physical pins to be shared by both the address and data buses.

The 8086 processor can be configured to operate in two different modes: minimum and maximum. In MIN mode, the 8086 outputs all bus control signals (e.g. WR, DT/R, DEN, ALE) and no external bus controller is required to interface 8086 to the bus. In MAX mode, the 8288 bus controller is used to decode S[2:0] signals from the 8086 into discrete bus control signals.

Note that this configuration is permanent and must be chosen prior to the schematic design. In the System/14 design, we will use the MAX mode as MIN mode does not allow the use of coprocessors (note that RQ/GTn and LOCK pins for multi bus master operation are not available in MIN mode). The processor mode is selected by hard-wiring MN/\MX pin HIGH for MIN mode and LOW for MAX mode.

The CLK/RESET/READY pins are directly connected to the corresponding pins on the 8284A clock generator, and the S0-S2 pins are connected to the 8288 bus controller for further decoding of the bus control signals. The \TEST pin is hard-wired LOW for now because no coprocessor is present; otherwise, the processor would pause execution on a WAIT instruction. The NMI/INTR pins should also be hard-wired LOW for now to prevent unnecessary interrupt generation. Apart from the AD bus and \BHE pins, the rest of the pins should be left unconnected.

The AD[15:0] and AD[19:16] local bus pins are connected to the 8282 address latches and 8286 transceivers to interface the processor local bus to the system bus.

8086 Bus Timing

An 8086 processor bus cycle consists of at least four processor clock cycles, divided as follows:
T1: Address emitted, Address latched (ALE), Transceiver mode set (DT/\R)
T2: Status emitted, Control commands issued (\RD, \WR, \DEN), Data emitted (if write cycle)
T3/Tw: Wait until data is available/accepted
T4: Deactivate control commands and bus signals

The 8086 processor provides a time-multiplexed address/data bus. In order to demultiplex the address and data signals for the system bus, the T1 cycle is used to emit the access address from the processor and latch it into the 8282 address latches.

For a read access, the T2/T3/Tw cycles are used to allow the addressed device to output the data. Note that the \RD command (\MRDC and \IORC in MAX mode) is issued during the T2 cycle and the data is expected to be available during the T3 cycle. The addressed device is expected to hold the 8284 RDYn line (which the 8284 synchronises into the READY signal) low when the \RD command is issued, though this is not absolutely necessary if the device is guaranteed to be able to respond before the end of the T3 cycle. If the addressed device cannot emit the data before the end of the T3 cycle, it keeps holding the RDYn line low until the data is available, and the 8086 processor inserts additional Tw (wait) cycles.

For a write access, the overall process is the same as for a read access; however, \DEN is issued prior to the T2 cycle and the data is output immediately after the address, during the T2 cycle, once the \WR command has been issued.
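As a rough illustration of how wait states stretch a bus cycle, the following sketch builds the T-state sequence for a given number of wait states. The function name bus_cycle_states is hypothetical, and the model only counts clock cycles; it does not simulate any bus signals.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write the T-state sequence of one bus cycle into `out`, for example
 * "T1 T2 T3 Tw Tw T4" for two wait states. `wait_states` is the number
 * of extra clocks for which the device keeps READY low (negative values
 * are treated as zero). Returns the total number of clock cycles used.
 */
int bus_cycle_states(int wait_states, char *out, size_t out_size)
{
    int clocks = 4;                         /* T1, T2, T3 and T4 */

    if (wait_states < 0)
        wait_states = 0;
    if (out_size == 0)
        return clocks + wait_states;

    snprintf(out, out_size, "T1 T2 T3");
    for (int i = 0; i < wait_states; i++) {
        strncat(out, " Tw", out_size - strlen(out) - 1);
        clocks++;                           /* each Tw adds one clock */
    }
    strncat(out, " T4", out_size - strlen(out) - 1);
    return clocks;
}
```

A device that responds in time yields the minimum four-clock cycle; every additional clock of READY held low adds one Tw between T3 and T4.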

System/14: New Breadboard Implementation Preview

With the arrival of the AT29C256 (E2PROM version of 27C256) ROMs and some time off from work, I decided to get back to working on the System/14 project.

Since my previous breadboard implementation was way too messy and getting out of control due to the 20-bit address and 16-bit data buses of the 8086, I spent the last two days crafting the new breadboard implementation from scratch.

New Breadboard Implementation of System/14

As you can see, this cut-to-length wire version is a lot neater than the previous pre-made wire version. It is no longer necessary for me to dig into the wall of wires to insert the test probe.

For the two ROM chips (AT29C256) on the right side, I simply used the ugly pre-made wire method to save some time as I really didn't feel like wiring additional batches of address and data buses.

The following is the list of ICs on the breadboard: (from the top)
Column 1: 8284A, 8086-2, 8288, 74LS00
Column 2: (3x) 8282, (2x) 8286
Column 3: 74LS00, 74LS04, 74LS20, 74LS02, 74LS08
Column 4: 74HCT138
Column 5: (2x) AT29C256

Disregard the C8255 in the rightmost column as I haven't wired it into the bus yet and I'm not even sure whether it works. It certainly is good eye candy though.

With the exception of the 7400 series logic ICs, all Intel 8xxx series ICs were described in the previous articles. The 7400 ICs added here are for the memory and I/O address decoding logic. I will describe them in detail in the upcoming articles.

The two ROMs were programmed with the following code:

.Loop:
    mov al, 0xAA
    out 0x60, al
    jmp .Loop

Since the system currently has no output device attached, it was necessary for me to devise an easy method to verify the system functionality at circuit level.

I decided that observing activities on the I/O bus (\IORC, \IOWC control signals), which is normally inactive unless specifically referenced by the executed instructions, is the simplest way to verify the system functionality without having any form of an output device.

The following is the oscilloscope capture of the above code successfully executing:

I/O Bus Oscilloscope Capture

CH1 is the \IOWC, and CH2 is the I/O address decoder output for the 0x60-0x6F range.

As expected, the I/O address decoder output is first activated (that is, the 8288 ALE signal is activated and the target I/O address is latched onto the 8282s) and the \IOWC signal is activated shortly after.

This clearly shows that our 8086 is indeed executing the code I burned into the ROMs, and not some gibberish read from the random noise.

 

I will soon be uploading the articles containing all the details about this implementation. For now, PCB fabrication is of the lowest priority as the breadboard will serve most of its originally intended purposes and even better (by allowing easy modifications).

Stay tuned.

CTL122: Firmware Walkthrough (Pt. 5 Keyboard Module)

The Keyboard Module in the CTL122 firmware is responsible for implementing the keyboard translation and command processing logic. The module closely interacts with the USB and PS/2 driver modules to process scancode translations and special key function handling.

The keyboard module provides the following functions:

    void ProcessKbd(void);
    void EnqueueSingleKeyUSBHID(char scanCode);
    void EnqueueSpecialKeyUSBHID(void);

ProcessKbd function is called periodically by the Timer 2 interrupt service routine and performs PS/2-to-USB translation as well as special key handling.

EnqueueSingleKeyUSBHID is used internally by the keyboard module to create and en-queue HID input report structures to the USB module circular FIFO buffer.

EnqueueSpecialKeyUSBHID performs a similar function to that of EnqueueSingleKeyUSBHID, but only sets the modifier (e.g. leftControl, leftAlt, leftShift) values of the input report structure.

void EnqueueSingleKeyUSBHID(char scanCode)
{
    KEYBOARD_INPUT_REPORT usbInReport;

    // Enqueue the input report for key down
    memset(&usbInReport, 0, sizeof(usbInReport));
    usbInReport.modifiers.bits.leftControl = Kbd_LCtrlDown;
    usbInReport.modifiers.bits.leftAlt = Kbd_LAltDown;
    usbInReport.modifiers.bits.leftShift = Kbd_LShiftDown;
    usbInReport.modifiers.bits.leftGUI = Kbd_LWinDown;
    usbInReport.modifiers.bits.rightControl = Kbd_RCtrlDown;
    usbInReport.modifiers.bits.rightAlt = Kbd_RAltDown;
    usbInReport.modifiers.bits.rightShift = Kbd_RShiftDown;
    usbInReport.keys[0] = scanCode;
    EnqueueTxUSBHID(&usbInReport);

    // Enqueue the input report for key up
    usbInReport.keys[0] = 0;
    EnqueueTxUSBHID(&usbInReport);
}

As shown above, a local USB HID input report structure is declared and zeroed, and then populated with all modifier key values (the Kbd_*Down variables are declared locally in the keyboard module and keep track of the modifier key states). The keys[0] value of the input report is then set to the scanCode parameter provided to the function.

Note that the scancode set 3 used by the 122-key Model M keyboards only provides key-up sequences for the special (aka modifier) keys; all the other keys do not generate key-up sequences as they do in scancode set 1. This is why the USB module input report enqueue function is called twice: first with the target scancode and then without one. The host computer operating system registers this as one KEYDOWN followed by one KEYUP.
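The down/up pairing can be shown in isolation with a small host-side sketch. Everything here is a stand-in: hid_report_t is a hypothetical flat version of the firmware's KEYBOARD_INPUT_REPORT (using the standard 8-byte boot keyboard layout), and enqueue_report replaces EnqueueTxUSBHID with a plain array so that the result can be inspected.

```c
#include <stdint.h>
#include <string.h>

/* 8-byte boot-protocol keyboard input report (hypothetical stand-in for
 * the firmware's KEYBOARD_INPUT_REPORT). */
typedef struct {
    uint8_t modifiers;  /* bit 0 LCtrl, 1 LShift, 2 LAlt, 3 LGUI, 4..7 right-hand keys */
    uint8_t reserved;
    uint8_t keys[6];    /* up to six simultaneously pressed key usages */
} hid_report_t;

#define QUEUE_LEN 8
static hid_report_t queue[QUEUE_LEN];
static unsigned queue_count = 0;

/* Hypothetical replacement for EnqueueTxUSBHID: append to a flat array. */
static void enqueue_report(const hid_report_t *r)
{
    if (queue_count < QUEUE_LEN)
        queue[queue_count++] = *r;
}

/* Emit a key-down report followed by a key-up report for one key. */
void tap_key(uint8_t modifiers, uint8_t usage)
{
    hid_report_t r;

    memset(&r, 0, sizeof r);
    r.modifiers = modifiers;
    r.keys[0] = usage;      /* key down */
    enqueue_report(&r);
    r.keys[0] = 0;          /* key up: same modifiers, no key */
    enqueue_report(&r);
}
```

Calling tap_key(0x02, 0x04) would queue two reports: Shift+A pressed, then Shift alone, which the host sees as a single capital A keystroke.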

void EnqueueSpecialKeyUSBHID(void)
{
    KEYBOARD_INPUT_REPORT usbInReport;

    // Enqueue the input report for key down
    memset(&usbInReport, 0, sizeof(usbInReport));
    usbInReport.modifiers.bits.leftControl = Kbd_LCtrlDown;
    usbInReport.modifiers.bits.leftAlt = Kbd_LAltDown;
    usbInReport.modifiers.bits.leftShift = Kbd_LShiftDown;
    usbInReport.modifiers.bits.leftGUI = Kbd_LWinDown;
    usbInReport.modifiers.bits.rightControl = Kbd_RCtrlDown;
    usbInReport.modifiers.bits.rightAlt = Kbd_RAltDown;
    usbInReport.modifiers.bits.rightShift = Kbd_RShiftDown;
    EnqueueTxUSBHID(&usbInReport);
}

The operation of the EnqueueSpecialKeyUSBHID function is identical to the EnqueueSingleKeyUSBHID function except that it does not set the keys value of the input report structure. It also calls EnqueueTxUSBHID only once because the scancode set 3 provides key-up sequences for the special (modifier) keys.

const char Kbd_TransTbl[256] = {
//  0     1     2     3     4     5     6     7
    0x00, 0x00, 0x00, 0x00, 0x00, 0x29, 0x00, 0x3A,   // 0
    0x68, 0x00, 0x00, 0x00, 0x00, 0x2B, 0x35, 0x3B,   // 1
    0x69, 0x00, 0x00, 0x64, 0x00, 0x14, 0x1E, 0x3C,   // 2
    0x6A, 0x00, 0x1D, 0x16, 0x04, 0x1A, 0x1F, 0x3D,   // 3
    0x6B, 0x06, 0x1B, 0x07, 0x08, 0x21, 0x20, 0x3E,   // 4
    0x6C, 0x2C, 0x19, 0x09, 0x17, 0x15, 0x22, 0x3F,   // 5
    0x6D, 0x11, 0x05, 0x0B, 0x0A, 0x1C, 0x23, 0x40,   // 6
    0x6E, 0x00, 0x10, 0x0D, 0x18, 0x24, 0x25, 0x41,   // 7
    0x6F, 0x36, 0x0E, 0x0C, 0x12, 0x27, 0x26, 0x42,   // 8
    0x70, 0x37, 0x38, 0x0F, 0x33, 0x13, 0x2D, 0x43,   // 9
    0x71, 0x00, 0x34, 0x31, 0x2F, 0x2E, 0x44, 0x72,   // 10
    0x00, 0x00, 0x28, 0x30, 0x00, 0x00, 0x45, 0x73,   // 11
    0x51, 0x50, 0x00, 0x52, 0x4C, 0x4D, 0x2A, 0x49,   // 12
    0x00, 0x59, 0x4F, 0x5C, 0x5F, 0x4E, 0x4A, 0x4B,   // 13
    0x62, 0x63, 0x5A, 0x5D, 0x5E, 0x60, 0x53, 0x54,   // 14
    0x00, 0x58, 0x5B, 0x00, 0x57, 0x61, 0x55, 0x00,   // 15
    0x00, 0x00, 0x00, 0x00, 0x56, 0x00, 0x00, 0x00,   // 16
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 17
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 18
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 19
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 20
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 21
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 22
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 23
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 24
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 25
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 26
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 27
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 28
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 29
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 30
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00    // 31
};

Kbd_TransTbl is the PS/2 Scancode Set 3 to USB HID Scancode translation table. The index in the array represents a PS/2 scancode and the value of the array element at that index represents the corresponding USB HID scancode. For example, the scancode for the A key in the PS/2 scancode set 3 is 0x1C; the value of Kbd_TransTbl[0x1C] is 0x04, which corresponds to the USB HID scancode for the A key.
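To make the indexing concrete, here is a sparse excerpt of the same idea using C99 designated initialisers. The four entries are copied from the table above (in the listing, the row label is the index divided by 8 and the column is the remainder, so 0x1C sits in row 3, column 4); the names trans_tbl and translate_scancode are hypothetical.

```c
#include <stdint.h>

/* Sparse excerpt of the translation table; unlisted entries are 0x00. */
static const uint8_t trans_tbl[256] = {
    [0x15] = 0x14,  /* Q     */
    [0x1C] = 0x04,  /* A     */
    [0x29] = 0x2C,  /* Space */
    [0x5A] = 0x28   /* Enter */
};

/* Look up the USB HID usage for a PS/2 scancode set 3 make code. */
uint8_t translate_scancode(uint8_t ps2_code)
{
    return trans_tbl[ps2_code];
}
```

Taking the scancode as an unsigned 8-bit value keeps every possible index inside the 256-entry table.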

Before I get into the ProcessKbd function, I will briefly list the operation modes of the Keyboard Module:

KBD_MODE_THRU: Through Mode provides direct PS/2-to-USB scancode translation. This mode is set by default after firmware initialisation and allows the 122-key Model M keyboard to function transparently.
KBD_MODE_WAITBREAK: This mode is not necessarily a mode of operation, but more of a state of the keyboard module state machine. It is entered when key break sequence (to be explained later) is read from the PS/2 queue.
KBD_MODE_WIN: Windows Function Mode provides special Windows (as in Microsoft operating system) key combination handling.
KBD_MODE_CFG: Configuration Mode allows the user to modify keyboard operations. For now, it doesn't really do that much, but it can potentially be used to perform macro key programming and various other configuration tasks.
KBD_MODE_MEM: Memory/Macro Mode simulates multiple key press sequences.

void ProcessKbd(void)
{
    char ps2ScanCode;

    // If PS/2 receive queue is not empty
    if (GetRxCountPS2() > 0)
    {
        // Dequeue PS/2 scancode
        ps2ScanCode = DequeueRxPS2();
...

The ProcessKbd function first checks whether there is any key sequence to process (i.e. whether the PS/2 buffer is not empty). If there are pending key sequences, it de-queues one from the PS/2 circular FIFO buffer and stores it in the ps2ScanCode local variable.

From here, processing depends on the value of the Kbd_Mode variable, which stores the active processing mode of the keyboard module. We will first take a look at the KBD_MODE_THRU, aka. Through Mode.

        if (Kbd_Mode == KBD_MODE_THRU)
        {
            switch (ps2ScanCode)
            {
                /******************************************************************/
                /* Special Keys                                                   */
                /******************************************************************/
                case 0x11: // LCtrl
                    Kbd_LCtrlDown = TRUE; EnqueueSpecialKeyUSBHID(); break;
                case 0x19: // LAlt
                    Kbd_LAltDown = TRUE; EnqueueSpecialKeyUSBHID(); break;
                case 0x12: // LShift
                    Kbd_LShiftDown = TRUE; EnqueueSpecialKeyUSBHID(); break;
                case 0x58: // RCtrl
                    Kbd_RCtrlDown = TRUE; EnqueueSpecialKeyUSBHID(); break;
                case 0x39: // RAlt
                    Kbd_RAltDown = TRUE; EnqueueSpecialKeyUSBHID(); break;
                case 0x59: // RShift
                    Kbd_RShiftDown = TRUE; EnqueueSpecialKeyUSBHID(); break;
                case 0x14: // CapsLk
                    EnqueueSingleKeyUSBHID(0x39); break;
                case 0x06: // WinKey
                    if (Kbd_LCtrlDown == TRUE)
                    {
                        Kbd_LCtrlDown = FALSE;
                        SetUIMessage("W I N   ");
                        Kbd_Mode = KBD_MODE_WIN;
                        Kbd_SubMode = KBD_SUBMODE_WIN_WIN;
                    }
                    else
                    {
                        Kbd_LWinDown = TRUE;
                        EnqueueSpecialKeyUSBHID();
                        Kbd_LWinDown = FALSE;
                        EnqueueSpecialKeyUSBHID();
                    }
                    break;
                case 0x83: // Explorer
                    Kbd_LWinDown = TRUE;
                    EnqueueSingleKeyUSBHID(0x08);
                    Kbd_LWinDown = FALSE;
                    EnqueueSpecialKeyUSBHID();
                    break;
                case 0x0A: // Exec
                    Kbd_LWinDown = TRUE;
                    EnqueueSingleKeyUSBHID(0x15);
                    Kbd_LWinDown = FALSE;
                    EnqueueSpecialKeyUSBHID();
                    break;
                case 0x01: // ConfigKey
                    SetUIMessage("C F G   ");
                    Kbd_Mode = KBD_MODE_CFG;
                    break;
                case 0x09: // MemKey
                    SetUIMessage("M E M   ");
                    Kbd_Mode = KBD_MODE_MEM;
                    break;
                case 0xF0: // (Key Break)
                    Kbd_Mode = KBD_MODE_WAITBREAK; break;
...

The THRU mode processing routine checks if any special (modifier) key is pressed. If a special key is pressed, the corresponding local key state variable is set to TRUE, meaning the key is in DOWN state. After the key state variable is set, EnqueueSpecialKeyUSBHID function is called to register the special key press to the host computer.

In addition to the modifier special keys, the processing routine also includes mode activation and shortcut keys. For example, the key 0x06 (I assigned this key to be the WinKey) generates an actual WinKey down/up sequence, or enters KBD_MODE_WIN if the left control key is in the DOWN state (Kbd_LCtrlDown == TRUE). The key 0x83 (Explorer) simulates a WIN+E (Windows Explorer) key press and 0x0A (Exec) simulates a WIN+R (Run) key press. These two are my most frequently used key combinations and I decided that they are worth having as dedicated keys.

IBM Model M 122-key Keyboard Scancode

The key sequence 0xF0 is not an actual key, but a key break command. Once this sequence is read, Kbd_Mode is set to KBD_MODE_WAITBREAK and the corresponding key-up (break) sequence is processed during the next call to the ProcessKbd function. Note that only special (modifier) keys generate key-up sequences in the PS/2 Scancode Set 3.

The following is the KBD_MODE_WAITBREAK processing routine:

        else if (Kbd_Mode == KBD_MODE_WAITBREAK)
        {
            switch (ps2ScanCode)
            {
                case 0x11: // LCtrl
                    Kbd_LCtrlDown = FALSE; break;
                case 0x19: // LAlt
                    Kbd_LAltDown = FALSE; break;
                case 0x12: // LShift
                    Kbd_LShiftDown = FALSE; break;
                case 0x58: // RCtrl
                    Kbd_RCtrlDown = FALSE; break;
                case 0x39: // RAlt
                    Kbd_RAltDown = FALSE; break;
                case 0x59: // RShift
                    Kbd_RShiftDown = FALSE; break;
                case 0x14: // CapsLk
                    break;
            }
            Kbd_Mode = KBD_MODE_THRU;
            EnqueueSpecialKeyUSBHID();
        }

The routine sets the corresponding key state to FALSE (that is, key up) and calls EnqueueSpecialKeyUSBHID to register the updated key state. In addition to the key state changes, Kbd_Mode is also reset to KBD_MODE_THRU. As previously mentioned, KBD_MODE_WAITBREAK is not an actual mode of operation, but a state of the keyboard module state machine.
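The make/break handling boils down to a two-state machine, which can be sketched on its own. This reduced version tracks only the left control key (scancode 0x11) and leaves out the USB side entirely; the names mode, lctrl_down and feed_scancode are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum kbd_mode { MODE_THRU, MODE_WAITBREAK };

static enum kbd_mode mode = MODE_THRU;
static bool lctrl_down = false;

/* Feed one scancode set 3 byte into the two-state machine. */
void feed_scancode(uint8_t code)
{
    if (mode == MODE_THRU) {
        if (code == 0xF0)
            mode = MODE_WAITBREAK;  /* the next byte names the released key */
        else if (code == 0x11)
            lctrl_down = true;      /* LCtrl make code */
    } else {                        /* MODE_WAITBREAK */
        if (code == 0x11)
            lctrl_down = false;     /* LCtrl break */
        mode = MODE_THRU;           /* always fall back to Through Mode */
    }
}
```

Feeding the byte sequence 0x11, 0xF0, 0x11 sets the key state to down, then returns it to up with the state machine back in Through Mode.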

Back to the KBD_MODE_THRU processing routine: if the value of ps2ScanCode corresponds to none of the above, the following routine is executed:

                default:
                    if (Kbd_Config_F1324WinProgCombo == TRUE)
                    {
            /******************************************************************/
            /* F13-F24 Windows Program Combo Processing                       */
            /******************************************************************/
                        Kbd_LWinDown = TRUE;
                        switch (ps2ScanCode)
                        {
                            case 0x08: // F13 (WinKey+1)
                                EnqueueSingleKeyUSBHID(0x1E); break;
                            case 0x10: // F14 (WinKey+2)
                                EnqueueSingleKeyUSBHID(0x1F); break;
                            case 0x18: // F15 (WinKey+3)
                                EnqueueSingleKeyUSBHID(0x20); break;
                            case 0x20: // F16 (WinKey+4)
                                EnqueueSingleKeyUSBHID(0x21); break;
                            case 0x28: // F17 (WinKey+5)
                                EnqueueSingleKeyUSBHID(0x22); break;
                            case 0x30: // F18 (WinKey+6)
                                EnqueueSingleKeyUSBHID(0x23); break;
                            case 0x38: // F19 (WinKey+7)
                                EnqueueSingleKeyUSBHID(0x24); break;
                            case 0x40: // F20 (WinKey+8)
                                EnqueueSingleKeyUSBHID(0x25); break;
                            case 0x48: // F21 (WinKey+9)
                                EnqueueSingleKeyUSBHID(0x26); break;
                            case 0x50: // F22 (WinKey+0)
                                EnqueueSingleKeyUSBHID(0x27); break;
                            default:
                                Kbd_LWinDown = FALSE;
                                goto ProcessStandardTranslation;
                        }
                        Kbd_LWinDown = FALSE;
                        EnqueueSpecialKeyUSBHID();
                    }
                    else
                    {
            /******************************************************************/
            /* Standard Translation                                           */
            /******************************************************************/
                    ProcessStandardTranslation:
                        EnqueueSingleKeyUSBHID(
                                Kbd_TransTbl[ps2ScanCode]
                                );
                    }
                    break;
            }
        }

Let's ignore the if (Kbd_Config_F1324WinProgCombo == TRUE) section for now and focus on the else part. If the value of ps2ScanCode corresponds to none of the previously mentioned scancodes (that is, the pressed key is neither a special/modifier key nor a key break sequence), Kbd_TransTbl is used to look up the corresponding USB HID scancode for the PS/2 scancode, and the translated scancode is en-queued to the USB circular FIFO buffer, thereby allowing transparent key-to-key translation.

Now, Kbd_Config_F1324WinProgCombo is a feature controlled by the KBD_MODE_CFG that allows F13 to F24 keys to be used as Win+(NUMBER) key combination in KBD_MODE_THRU. This allows F13-F22 to be used to select a task on the Windows taskbar. F23 and F24 are not actually used as Windows (as far as I know) only supports from Win+1 up to Win+0.
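As a side note, the F13-F22 scancodes in the switch statement are evenly spaced (0x08, 0x10 and so on up to 0x50), so the same mapping could also be computed instead of enumerated. The following is an alternative sketch, not what the firmware does; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Map the scancodes of F13..F22 (0x08, 0x10, ..., 0x50) to the HID usages
 * of the digit keys 1..9 and 0 (0x1E..0x27). Returns 0 for any other
 * scancode.
 */
uint8_t f13_f22_to_digit_usage(uint8_t ps2_code)
{
    if (ps2_code < 0x08 || ps2_code > 0x50 || (ps2_code & 0x07) != 0)
        return 0;
    return (uint8_t)(0x1E + (ps2_code >> 3) - 1);
}
```

The explicit switch in the firmware is arguably easier to read and to modify per key, which is a reasonable trade-off for ten entries.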

        else if (Kbd_Mode == KBD_MODE_CFG)
        {
            switch (ps2ScanCode)
            {
                case 0x1D: // W (F13-F24 Windows Program Combo)
                    Kbd_Config_F1324WinProgCombo = !Kbd_Config_F1324WinProgCombo;
                    break;
            }
            Kbd_Mode = KBD_MODE_THRU;
            SetUIMessage("T H R U ");
        }

The above is the KBD_MODE_CFG processing routine. This routine doesn't really do much for now; its sole function is to toggle the Kbd_Config_F1324WinProgCombo setting on-line. This mode can be extended to support further feature configuration and macro programming. Since I couldn't really think of special features to add (or more like, couldn't be bothered to) and I am not really a gamer who requires various macro functions either, I decided to leave it at this.

        else if (Kbd_Mode == KBD_MODE_WIN)
        {
            if (ps2ScanCode == 0x05) // Escape
                goto ExitWinMode;
            else if (ps2ScanCode == 0x06) // WinKey Repeat
            {
                switch (Kbd_SubMode)
                {
                    case KBD_SUBMODE_WIN_WIN:
                        Kbd_LShiftDown = TRUE;
                        SetUIMessage("W I S   ");
                        Kbd_SubMode = KBD_SUBMODE_WIN_WIS;
                        break;
                    case KBD_SUBMODE_WIN_WIS:
                        Kbd_LShiftDown = FALSE;
                        goto ExitWinMode;
                }
            }
            else
            {
                Kbd_LWinDown = TRUE;
                EnqueueSpecialKeyUSBHID();
                switch (ps2ScanCode)
                {
                    case 0x2D: // R (Run)
                        EnqueueSingleKeyUSBHID(0x15); goto ExitWinMode;
                    case 0x24: // E (Explorer)
                        EnqueueSingleKeyUSBHID(0x08); goto ExitWinMode;
                    case 0x23: // D (Desktop)
                        EnqueueSingleKeyUSBHID(0x07); goto ExitWinMode;
                    case 0x07: // F1 (WinKey+1)
                        EnqueueSingleKeyUSBHID(0x1E); break;
                    case 0x0F: // F2 (WinKey+2)
                        EnqueueSingleKeyUSBHID(0x1F); break;
                    case 0x17: // F3 (WinKey+3)
                        EnqueueSingleKeyUSBHID(0x20); break;
                    case 0x1F: // F4 (WinKey+4)
                        EnqueueSingleKeyUSBHID(0x21); break;
                    case 0x27: // F5 (WinKey+5)
                        EnqueueSingleKeyUSBHID(0x22); break;
                    case 0x2F: // F6 (WinKey+6)
                        EnqueueSingleKeyUSBHID(0x23); break;
                    case 0x37: // F7 (WinKey+7)
                        EnqueueSingleKeyUSBHID(0x24); break;
                    case 0x3F: // F8 (WinKey+8)
                        EnqueueSingleKeyUSBHID(0x25); break;
                    case 0x47: // F9 (WinKey+9)
                        EnqueueSingleKeyUSBHID(0x26); break;
                    case 0x4F: // F10 (WinKey+0)
                        EnqueueSingleKeyUSBHID(0x27); break;
                    case 0x5A: // Enter
                        EnqueueSingleKeyUSBHID(0x28); break;
                }
            }
            return;
ExitWinMode:
            Kbd_LWinDown = FALSE;
            Kbd_LShiftDown = FALSE;
            EnqueueSpecialKeyUSBHID();
            Kbd_Mode = KBD_MODE_THRU;
            SetUIMessage("T H R U ");
        }

The KBD_MODE_WIN processing routine has two sub-processing modes: KBD_SUBMODE_WIN_WIN and KBD_SUBMODE_WIN_WIS. The former is entered by default and the latter is entered by pressing Win key after the keyboard module is in KBD_SUBMODE_WIN_WIN mode.

The only difference between the two modes is that the _WIS mode produces the WinKey combos with the Left Shift key pressed as well. Note that KBD_MODE_WIN allows the F1 to F10 keys to be used as WinKey+(NUMBER) combinations (the function is basically identical to the previously described F13 to F22 feature, but with F1 to F10 instead; in fact, the F13-F22 implementation was an afterthought). The WIS submode, which presses the Left Shift key along with the WinKey+(NUMBER) key combination, launches a new instance of the corresponding taskbar task (that is, WinKey+Shift+(NUMBER); if you are not familiar with it, look it up on Google).

The rest of the KBD_MODE_WIN routine should be quite obvious by now and I won't explain it any further.

        else if (Kbd_Mode == KBD_MODE_MEM)
        {
            switch (ps2ScanCode)
            {
                case 0x07: // F1
                    EnqueueSingleKeyUSBHID(0x04 + 'm' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'a' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'c' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'r' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'o' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 't' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'e' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 's' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 't' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'm' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'e' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 's' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 's' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'a' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'g' - 'a');
                    EnqueueSingleKeyUSBHID(0x04 + 'e' - 'a');
                    break;
                case 0x0F: // F2
                    for (int i = 0; i < 26; i++)
                    {
                        EnqueueSingleKeyUSBHID(0x04 + i);
                        EnqueueSingleKeyUSBHID(0x28);
                    }
                    break;
            }
            Kbd_Mode = KBD_MODE_THRU;
            SetUIMessage("T H R U ");
        }

Finally, we are at the KBD_MODE_MEM processing routine. As I previously mentioned, I am not a very big fan of macros and don't really have a good use for them. This feature was implemented solely as a demonstration of its potential usage.

Once in the memory/macro mode, pressing the F1 key will type macrotestmessage and the F2 key will type a{ENTER}b{ENTER} and so on up to z{ENTER}. Note that the output will be cut off due to the limitation of the PIC18F4550 memory (the input report circular FIFO buffer can't handle all those inputs). Either way, there shouldn't be a need for one to type that many keys in a single macro.
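The letter arithmetic used by the macros, and the load they put on the report queue, can be checked with a small sketch. Both function names are hypothetical; the report count simply follows from EnqueueSingleKeyUSBHID queueing one key-down and one key-up report per key.

```c
#include <stdint.h>
#include <stddef.h>

/* HID usage for a lowercase ASCII letter, or 0 if `c` is not a-z. */
uint8_t hid_usage_for_letter(char c)
{
    if (c < 'a' || c > 'z')
        return 0;
    return (uint8_t)(0x04 + (c - 'a'));
}

/* Number of input reports a macro produces: one down + one up per key. */
size_t reports_for_keys(size_t key_count)
{
    return key_count * 2;
}
```

By this count the F1 macro (16 keys) queues 32 reports, while the F2 macro (26 letters plus 26 Enter presses) queues 104 reports in a single pass.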

This concludes the CTL122: Firmware Walkthrough series. The full source code of the firmware is available at the CTL122 project page (put your cursor over Projects item on the top menu bar and click CTL122).

CTL122: Firmware Walkthrough (Pt. 4 PS/2 Communication)

In part 4 of this series, we will analyse the PS/2 communication routines of the CTL122 firmware. For USB communications, please refer to Part 2 and Part 3.

The following are the constant and function definitions from PS2.h:

#define _PS2_RXBUFLEN   32

#define PS2_RECV        0
#define PS2_TIMEOUT     1

    void ProcessPS2(int mode);
    void EnqueueRxPS2(char elem);
    char DequeueRxPS2(void);
    unsigned int GetRxCountPS2(void);
    int IsRxFullPS2(void);

PS/2 module implements a queue structure similar to that of the USB module. PS/2 module, however, queues individual byte level transmission on the PS/2 bus, unlike the USB module that queues input reports.

The PS/2 module operates with the on-chip CCP (Capture/Compare/PWM) module to implement bit capture on the PS/2 bus. The following is the CCP1 module initialisation routine from Init.c:

void InitCapture(void)
{
    // Initialise Capture 1
    OpenCapture1(C1_EVERY_FALL_EDGE & CAPTURE_INT_ON);
    // Initialise the Timer 1
    OpenTimer1(TIMER_INT_ON & T1_SOURCE_EXT & T1_PS_1_1 & T1_16BIT_RW);
    WriteTimer1(0x0000);
}

The CCP1 module is configured to raise an interrupt on every falling edge of the associated pin (PS2_CLK). The initialisation routine also sets up Timer 1, which is used to detect PS/2 bus transmission sequence timeouts. Note that the PS/2 protocol is essentially a synchronous, UART-like serial transmission. For details, refer to the Interfacing with PS/2 Devices article.

    // Timer 1 Interrupt
    else if (PIR1bits.TMR1IF == 1)
    {
        ProcessPS2(PS2_TIMEOUT);
        PIR1bits.TMR1IF = 0;
    }

...

    // Capture Module 1 Interrupt
    else if (PIR1bits.CCP1IF == 1)
    {
        ProcessPS2(PS2_RECV);
        PIR1bits.CCP1IF = 0;
    }

The above is an excerpt from Interrupt.c. Note that Timer 1 interrupt calls ProcessPS2 function with PS2_TIMEOUT, which resets the UART receive sequence of the PS/2 module, and Capture Module 1 interrupt calls the function with PS2_RECV.

The following excerpt shows PS/2 module internal variables:

// Receive Sequence
unsigned int PS2_CurRx;
char PS2_CurRxIdx = 0;

// Receive Buffer
char PS2_RxBuf[_PS2_RXBUFLEN];
unsigned int PS2_RxBufIdx = 0;
unsigned int PS2_RxBufCnt = 0;

PS2_CurRx contains the bits received so far in the active byte transmission and PS2_CurRxIdx contains the current bit index in the active byte transmission sequence.

The receive buffer variables are the values required to implement a circular FIFO buffer for PS/2 transmissions.
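For readers unfamiliar with the structure, a circular FIFO built from a buffer, an index and a count can be sketched as follows. This is a generic illustration that assumes the index points at the oldest element; it is not the firmware's EnqueueRxPS2/DequeueRxPS2 code, and all names here are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define RXBUFLEN 32                 /* mirrors _PS2_RXBUFLEN */

static uint8_t  rx_buf[RXBUFLEN];
static unsigned rx_head  = 0;       /* index of the oldest element */
static unsigned rx_count = 0;       /* number of stored elements   */

bool rx_is_full(void)
{
    return rx_count == RXBUFLEN;
}

/* Store one byte at the tail; returns false (byte dropped) when full. */
bool rx_enqueue(uint8_t elem)
{
    if (rx_is_full())
        return false;
    rx_buf[(rx_head + rx_count) % RXBUFLEN] = elem;
    rx_count++;
    return true;
}

/* Remove and return the oldest byte; returns 0 when the buffer is empty. */
uint8_t rx_dequeue(void)
{
    uint8_t elem;

    if (rx_count == 0)
        return 0;
    elem = rx_buf[rx_head];
    rx_head = (rx_head + 1) % RXBUFLEN;
    rx_count--;
    return elem;
}
```

The sketch ignores interrupt safety; on the real device, the enqueue and dequeue sides run from different interrupt handlers, so the shared index and count need care.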

void ProcessPS2(int mode)
{
    char temp;

    switch (mode)
    {
        // Receive Event
        case PS2_RECV:
            if (PS2_CurRxIdx < 10) // Append bits
            {
                PS2_CurRx <<= 1;
                PS2_CurRx |= PS2_DATA;
                PS2_CurRxIdx++;
            }
            else if (PS2_CurRxIdx == 10) // Stop bit
            {
                // Generate reordered temp. received byte
                temp = SwapBitOrder((char)(PS2_CurRx >> 1));
                // Verify the receive buffer state
                if (IsRxFullPS2() == FALSE)
                {
                    // Enqueue to the receive buffer
                    EnqueueRxPS2(temp);
                }
                else
                {
                    // Print error on 7 segment displays
                    SetUIMessage("P B O F ");
                }
                // Reset the receive sequence index
                PS2_CurRxIdx = 0;
            }
            WriteTimer1(0x0000);
            break;
        // Receive Timeout
        case PS2_TIMEOUT:
            PS2_CurRxIdx = 0;
            break;
    }
}

The ProcessPS2 function has two operating modes: PS2_RECV and PS2_TIMEOUT. As previously mentioned, the function is called with PS2_RECV whenever the CCP1 module interrupt is raised (in other words, on every falling edge of the PS2_CLK pin) and with PS2_TIMEOUT whenever the Timer 1 interrupt is raised.

The PS2_RECV mode initiates/processes/completes the byte transmission sequence on the PS/2 bus.

Note that PS2_CLK normally sits at the HIGH logic level and transitions to LOW when a transmission sequence begins (start bit), creating a falling edge which, in turn, triggers the CCP1 module interrupt. The PS/2 device keeps clocking until the last bit (stop bit) has been transmitted, and the state of PS2_DATA is latched at every falling edge.

            if (PS2_CurRxIdx < 10) // Append bits
            {
                PS2_CurRx <<= 1;
                PS2_CurRx |= PS2_DATA;
                PS2_CurRxIdx++;
            }

Each byte transmission sequence on the PS/2 bus consists of 1 start bit, 8 data bits (least significant bit first), 1 parity bit and 1 stop bit. PS2_CurRx is first shifted left by one and then ORed with PS2_DATA, which latches the logic state of the PS2_DATA pin into the LSB of PS2_CurRx. PS2_CurRxIdx is then incremented, and this process repeats for the first 10 bits (that is, the start, data and parity bits).

            else if (PS2_CurRxIdx == 10) // Stop bit
            {
                // Generate reordered temp. received byte
                temp = SwapBitOrder((char)(PS2_CurRx >> 1));
                // Verify the receive buffer state
                if (IsRxFullPS2() == FALSE)
                {
                    // Enqueue to the receive buffer
                    EnqueueRxPS2(temp);
                }
                else
                {
                    // Print error on 7 segment displays
                    SetUIMessage("P B O F ");
                }
                // Reset the receive sequence index
                PS2_CurRxIdx = 0;
            }

At the 11th bit (stop bit), PS2_CurRx is shifted right by one bit to remove the parity bit (the parity bit is not checked) and cast to char to discard the start bit. Because the data bits arrive LSB first, the remaining 8 bits are in reverse order; they are bit-order-swapped (MSB<->LSB) and saved to temp. The PS/2 queue state is then checked and, if the queue is not full, temp is queued for processing by the keyboard module. Also note that PS2_CurRxIdx is set to 0, as this completes a transmission sequence.
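The shift, truncate and swap steps above can be tried out on a host machine. The following is only a sketch under a few assumptions: swap_bit_order is a hypothetical stand-in for the firmware's SwapBitOrder (whose implementation is not shown in this article), the other function names are made up for illustration, and the odd parity check is an extra that the firmware itself does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the firmware's SwapBitOrder(): reverses the
 * order of the 8 bits in a byte (bit 0 <-> bit 7, bit 1 <-> bit 6, ...). */
uint8_t swap_bit_order(uint8_t v)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (v & 1u));
        v >>= 1;
    }
    return r;
}

/* Shift the first 10 bits of a frame (start, D0..D7, parity) into a
 * register the same way ProcessPS2 does, one bit per falling clock edge. */
uint16_t shift_in_frame(const uint8_t bits[10])
{
    assert(bits != NULL);
    uint16_t rx = 0;
    for (int i = 0; i < 10; i++) {
        rx = (uint16_t)((rx << 1) | (bits[i] & 1u));
    }
    return rx;
}

/* Recover the data byte: drop the parity bit (>> 1), truncate away the
 * start bit (cast to 8 bits), then undo the LSB-first ordering. */
uint8_t extract_data(uint16_t rx)
{
    return swap_bit_order((uint8_t)(rx >> 1));
}

/* Not done by the firmware: PS/2 uses odd parity, so the 8 data bits plus
 * the parity bit must contain an odd number of ones. */
int parity_ok(uint16_t rx)
{
    unsigned ones = 0;
    for (uint16_t v = rx & 0x1FFu; v != 0; v >>= 1) {
        ones += v & 1u;
    }
    return (ones & 1u) == 1u;
}
```

Feeding in the ten bits 0, 0,0,1,1,1,0,0,0, 0 (start bit, 0x1C sent LSB first, parity bit) leaves 0x070 in the register; extract_data then yields 0x1C and parity_ok accepts the frame.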

WriteTimer1(0x0000);

At the end of PS2_RECV mode (for both PS2_CurRxIdx < 10 and == 10), the timer 1 counter is reset in order to prevent the receive sequence from timing out. If, during a byte sequence, transmission is interrupted (that is, no further falling edge is detected on the PS2_CLK pin), the timer 1 counter will eventually overflow and raise the timer 1 interrupt (calls ProcessPS2 with PS2_TIMEOUT) which resets the PS2_CurRxIdx.

CTL122: Firmware Walkthrough (Pt. 3 USB Communication)

In this part of CTL122 Firmware Walkthrough, we will take a look at the USB communication part of the firmware code. For USB configurations, please refer to the Firmware Walkthrough Part 2.

CTL122 firmware utilises the Microchip MLA USB library for USB communication. We will first take a look at the USB.h file responsible for defining all constants and structures required for USB operation.

#define KEYBOARD_INPUT_REPORT_DATA_BUFFER_ADDRESS_TAG   @ 0x500
#define KEYBOARD_OUTPUT_REPORT_DATA_BUFFER_ADDRESS_TAG  @ 0x508

The header file begins with the addresses at which the USB input and output buffers are located. The KEYBOARD_INPUT/OUTPUT_REPORT_DATA_BUFFER_ADDRESS_TAG definitions are referenced by USB.c to declare the locations of the report (packet) structures. The following excerpt shows the declaration of the inputReport and outputReport structures in USB.c.

#if !defined(KEYBOARD_INPUT_REPORT_DATA_BUFFER_ADDRESS_TAG)
    #define KEYBOARD_INPUT_REPORT_DATA_BUFFER_ADDRESS_TAG
#endif
KEYBOARD_INPUT_REPORT inputReport KEYBOARD_INPUT_REPORT_DATA_BUFFER_ADDRESS_TAG;

#if !defined(KEYBOARD_OUTPUT_REPORT_DATA_BUFFER_ADDRESS_TAG)
    #define KEYBOARD_OUTPUT_REPORT_DATA_BUFFER_ADDRESS_TAG
#endif
volatile KEYBOARD_OUTPUT_REPORT outputReport KEYBOARD_OUTPUT_REPORT_DATA_BUFFER_ADDRESS_TAG;

The reason for specifying pre-defined addresses for the USB buffers is to ensure that the buffers are allocated in the dual port memory banks (Bank 4 through 7) of the PIC18F4550 microcontroller. Because both the microcontroller core and SIE (Serial Interface Engine) need to access the memory area, it is imperative that the memory bank at which the buffers are located is dual ported.
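As a quick sanity check of the addresses chosen above, the sketch below tests whether a buffer lies inside the dual port RAM. It assumes the PIC18F4550 USB RAM spans 0x400 to 0x7FF (Banks 4 through 7, 256 bytes per bank); the function and macro names are hypothetical and not part of the firmware or the MLA.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed PIC18F4550 dual port USB RAM: banks 4..7, 0x400..0x7FF (1 KB). */
#define USB_RAM_START 0x400u
#define USB_RAM_END   0x7FFu   /* inclusive */

/* Returns 1 when the range [addr, addr + size) lies entirely inside the
 * USB RAM, 0 otherwise (including for an empty range). */
int in_usb_ram(uint16_t addr, size_t size)
{
    if (size == 0) {
        return 0;
    }
    uint32_t last = (uint32_t)addr + (uint32_t)size - 1u;
    return addr >= USB_RAM_START && last <= USB_RAM_END;
}

/* With 256-byte banks, the bank number is the upper byte of the address. */
unsigned bank_of(uint16_t addr)
{
    return (unsigned)(addr >> 8);
}
```

Both report buffers pass this check: the 8-byte input report at 0x500 and the 1-byte output report at 0x508 sit next to each other in Bank 5.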

The following is the definition of the KEYBOARD_INPUT_REPORT structure declared above:

    /* This typedef defines the only INPUT report found in the HID report
     * descriptor and gives an easy way to create the OUTPUT report. */
    typedef struct __attribute__((packed))
    {
        /* The union below represents the first byte of the INPUT report.  It is
         * formed by the following HID report items:
         *
         *  0x19, 0xe0, //   USAGE_MINIMUM (Keyboard LeftControl)
         *  0x29, 0xe7, //   USAGE_MAXIMUM (Keyboard Right GUI)
         *  0x15, 0x00, //   LOGICAL_MINIMUM (0)
         *  0x25, 0x01, //   LOGICAL_MAXIMUM (1)
         *  0x75, 0x01, //   REPORT_SIZE (1)
         *  0x95, 0x08, //   REPORT_COUNT (8)
         *  0x81, 0x02, //   INPUT (Data,Var,Abs)
         *
         * The report size is 1 specifying 1 bit per entry.
         * The report count is 8 specifying there are 8 entries.
         * These entries represent the Usage items between Left Control (the usage
         * minimum) and Right GUI (the usage maximum).
         */
        union __attribute__((packed))
        {
            uint8_t value;
            struct __attribute__((packed))
            {
                unsigned leftControl    :1;
                unsigned leftShift      :1;
                unsigned leftAlt        :1;
                unsigned leftGUI        :1;
                unsigned rightControl   :1;
                unsigned rightShift     :1;
                unsigned rightAlt       :1;
                unsigned rightGUI       :1;
            } bits;
        } modifiers;

        /* There is one byte of constant data/padding that is specified in the
         * input report:
         *
         *  0x95, 0x01,                    //   REPORT_COUNT (1)
         *  0x75, 0x08,                    //   REPORT_SIZE (8)
         *  0x81, 0x03,                    //   INPUT (Cnst,Var,Abs)
         */
        unsigned :8;

        /* The last INPUT item in the INPUT report is an array type.  This array
         * contains an entry for each of the keys that are currently pressed until
         * the array limit, in this case 6 concurrent key presses.
         *
         *  0x95, 0x06,                    //   REPORT_COUNT (6)
         *  0x75, 0x08,                    //   REPORT_SIZE (8)
         *  0x15, 0x00,                    //   LOGICAL_MINIMUM (0)
         *  0x25, 0x65,                    //   LOGICAL_MAXIMUM (101)
         *  0x05, 0x07,                    //   USAGE_PAGE (Keyboard)
         *  0x19, 0x00,                    //   USAGE_MINIMUM (Reserved (no event indicated))
         *  0x29, 0x65,                    //   USAGE_MAXIMUM (Keyboard Application)
         *
         * Report count is 6 indicating that the array has 6 total entries.
         * Report size is 8 indicating each entry in the array is one byte.
         * The usage minimum indicates the lowest key value (Reserved/no event)
         * The usage maximum indicates the highest key value (Application button)
         * The logical minimum indicates the remapped value for the usage minimum:
         *   No Event has a logical value of 0.
         * The logical maximum indicates the remapped value for the usage maximum:
         *   Application button has a logical value of 101.
         *
         * In this case the logical min/max match the usage min/max so the logical
         * remapping doesn't actually change the values.
         *
         * To send a report with the 'a' key pressed (usage value of 0x04, logical
         * value in this example of 0x04 as well), then the array input would be the
         * following:
         *
         * LSB [0x04][0x00][0x00][0x00][0x00][0x00] MSB
         *
         * If the 'b' button was then pressed with the 'a' button still held down,
         * the report would then look like this:
         *
         * LSB [0x04][0x05][0x00][0x00][0x00][0x00] MSB
         *
         * If the 'a' button was then released with the 'b' button still held down,
         * the resulting array would be the following:
         *
         * LSB [0x05][0x00][0x00][0x00][0x00][0x00] MSB
         *
         * The 'a' key was removed from the array and all other items in the array
         * were shifted down. */
        uint8_t keys[6];
    } KEYBOARD_INPUT_REPORT;

With all the comments from the MLA reference library, the above code is largely self-explanatory. In summary, the structure defines the layout of the packets sent from CTL122 to the host computer: a modifier key bit field (the first union), one padding byte, and an array of key codes (HID usage values). This is, in fact, where the infamous 6 simultaneous key press limit of USB HID keyboards comes from. Note that the key code buffer is declared as keys[6], which limits the number of simultaneous (non-modifier) key presses to 6.
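The press/release behaviour described in the MLA comments (append to the first free slot, shift the remaining entries down on release) can be sketched as follows. This is an illustration only, not the CTL122 keyboard module: report_t is a hypothetical byte-for-byte mirror of KEYBOARD_INPUT_REPORT that uses a plain byte instead of bit-fields, and report_press/report_release are made-up helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_SLOTS 6

/* Hypothetical 8-byte mirror of KEYBOARD_INPUT_REPORT. */
typedef struct {
    uint8_t modifiers;        /* bit 0 = Left Control ... bit 7 = Right GUI */
    uint8_t reserved;         /* constant padding byte */
    uint8_t keys[KEY_SLOTS];  /* usage values of the keys currently held */
} report_t;

/* Add a usage value to the first free slot. Returns 0 on success (or if the
 * key is already present) and -1 when all six slots are in use. */
int report_press(report_t *r, uint8_t usage)
{
    assert(r != NULL);
    if (usage == 0) {
        return 0;                    /* 0 means "no event", nothing to add */
    }
    for (int i = 0; i < KEY_SLOTS; i++) {
        if (r->keys[i] == usage) {
            return 0;                /* already reported */
        }
        if (r->keys[i] == 0) {
            r->keys[i] = usage;      /* first free slot */
            return 0;
        }
    }
    return -1;                       /* seventh key: no room left */
}

/* Remove a usage value and shift the remaining entries down. */
void report_release(report_t *r, uint8_t usage)
{
    assert(r != NULL);
    if (usage == 0) {
        return;
    }
    for (int i = 0; i < KEY_SLOTS; i++) {
        if (r->keys[i] != usage) {
            continue;
        }
        for (int j = i; j < KEY_SLOTS - 1; j++) {
            r->keys[j] = r->keys[j + 1];
        }
        r->keys[KEY_SLOTS - 1] = 0;
        return;
    }
}
```

Pressing 'a' (0x04) and then 'b' (0x05) fills the first two slots; releasing 'a' moves 0x05 down to keys[0], matching the sequence in the comment block. A seventh key simply has nowhere to go, which is the 6-key limit in practice.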

    /* This typedef defines the only OUTPUT report found in the HID report
     * descriptor and gives an easy way to parse the OUTPUT report. */
    typedef union __attribute__((packed))
    {
        /* The OUTPUT report is comprised of only one byte of data. */
        uint8_t value;
        struct
        {
            /* There are two report items that form the one byte of OUTPUT report
             * data.  The first report item defines 5 LED indicators:
             *
             *  0x95, 0x05,                    //   REPORT_COUNT (5)
             *  0x75, 0x01,                    //   REPORT_SIZE (1)
             *  0x05, 0x08,                    //   USAGE_PAGE (LEDs)
             *  0x19, 0x01,                    //   USAGE_MINIMUM (Num Lock)
             *  0x29, 0x05,                    //   USAGE_MAXIMUM (Kana)
             *  0x91, 0x02,                    //   OUTPUT (Data,Var,Abs)
             *
             * The report count indicates there are 5 entries.
             * The report size is 1 indicating each entry is just one bit.
             * These items are located on the LED usage page
             * These items are all of the usages between Num Lock (the usage
             * minimum) and Kana (the usage maximum).
             */
            unsigned numLock        :1;
            unsigned capsLock       :1;
            unsigned scrollLock     :1;
            unsigned compose        :1;
            unsigned kana           :1;

            /* The second OUTPUT report item defines 3 bits of constant data
             * (padding) used to make a complete byte:
             *
             *  0x95, 0x01,                    //   REPORT_COUNT (1)
             *  0x75, 0x03,                    //   REPORT_SIZE (3)
             *  0x91, 0x03,                    //   OUTPUT (Cnst,Var,Abs)
             *
             * Report count of 1 indicates that there is one entry
             * Report size of 3 indicates the entry is 3 bits long. */
            unsigned                :3;
        } leds;
    } KEYBOARD_OUTPUT_REPORT;

The above is the definition of the KEYBOARD_OUTPUT_REPORT structure. This structure represents a USB packet sent from the host computer to CTL122. The only interesting field in this structure would be leds which represents the keyboard LED status (Num Lock, Caps Lock, Scroll Lock) sent by the host computer operating system. This packet is periodically sent by the host computer operating system and allows CTL122 to keep the keyboard LED status up to date.
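For readers who prefer masks over bit-fields, the same byte can be decoded as shown below. This is a sketch only; the macro and function names are hypothetical, and the bit positions follow the order of the report items quoted above (Num Lock in bit 0 through Kana in bit 4).

```c
#include <stdint.h>

/* Bit positions in the one-byte keyboard OUTPUT report (LED usages 1..5). */
#define LED_NUM_LOCK     (1u << 0)
#define LED_CAPS_LOCK    (1u << 1)
#define LED_SCROLL_LOCK  (1u << 2)
#define LED_COMPOSE      (1u << 3)
#define LED_KANA         (1u << 4)
#define LED_MASK         0x1Fu     /* bits 5..7 are constant padding */

/* Returns 1 if the given LED bit is set in the report byte, 0 otherwise. */
int led_is_on(uint8_t report, uint8_t led)
{
    return (report & led) != 0;
}

/* Strips the three padding bits and returns only the LED bits. */
uint8_t led_bits(uint8_t report)
{
    return (uint8_t)(report & LED_MASK);
}
```

A report byte of 0x03, for example, means Num Lock and Caps Lock are on and Scroll Lock is off.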

    /* This creates a storage type for all of the information required to track the
     * current state of the keyboard. */
    typedef struct
    {
        USB_HANDLE lastINTransmission;
        USB_HANDLE lastOUTTransmission;
    } USBHID_CONTEXT;

The USBHID_CONTEXT structure contains the handles of the last IN and OUT transmissions. This allows the firmware to keep track of whether the last USB IN/OUT transmissions have completed or are still in progress (busy).

In short, USB.c/.h implements a circular FIFO buffer into which other modules queue key transmissions.

// Transmit Buffer
KEYBOARD_INPUT_REPORT USBHID_TxBuf[_USBHID_TXBUFLEN];
unsigned int USBHID_TxBufIdx = 0;
unsigned int USBHID_TxBufCnt = 0;

USBHID_TxBuf array contains previously discussed KEYBOARD_INPUT_REPORT structures to be transmitted to the host computer. USBHID_TxBufIdx and USBHID_TxBufCnt implement a circular FIFO buffer for KEYBOARD_INPUT_REPORT structures to be queued into.

void EnqueueTxUSBHID(const KEYBOARD_INPUT_REPORT *elem)
{
    unsigned int end = (USBHID_TxBufIdx + USBHID_TxBufCnt) % _USBHID_TXBUFLEN;

    memcpy(&USBHID_TxBuf[end], elem, sizeof(KEYBOARD_INPUT_REPORT));

    if (USBHID_TxBufCnt == _USBHID_TXBUFLEN)
        USBHID_TxBufIdx = (USBHID_TxBufIdx + 1) % _USBHID_TXBUFLEN;
    else
        USBHID_TxBufCnt++;
}

void DequeueTxUSBHID(KEYBOARD_INPUT_REPORT *elem)
{
    memcpy(elem, &USBHID_TxBuf[USBHID_TxBufIdx], sizeof(KEYBOARD_INPUT_REPORT));

    USBHID_TxBufIdx = (USBHID_TxBufIdx + 1) % _USBHID_TXBUFLEN;
    USBHID_TxBufCnt--;
}

unsigned int GetTxCountUSBHID(void)
{
    return USBHID_TxBufCnt;
}

int IsTxFullUSBHID(void)
{
    return USBHID_TxBufCnt == _USBHID_TXBUFLEN;
}

The above excerpt shows the circular FIFO buffer implementation of the USB KEYBOARD_INPUT_REPORT structures. EnqueueTxUSBHID function is called by the Keyboard module (Keyboard.c) for registering key presses to be sent to the host computer, and DequeueTxUSBHID function is called by the USB module itself to de-queue the KEYBOARD_INPUT_REPORT structure to be sent. Since the algorithm of a circular FIFO buffer should be very familiar to the readers, I will not further explain the details of it.
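One detail of the enqueue routine is still worth pointing out: when the buffer is already full, the new element overwrites the oldest one and the read index advances, so the oldest report is silently dropped. The host-side sketch below mirrors that logic with a 4-element buffer of plain bytes. The ring_t type and function names are hypothetical, and the empty check in ring_dequeue is an addition (the firmware relies on the caller checking GetTxCountUSBHID first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_LEN 4u

typedef struct {
    uint8_t  buf[RING_LEN];
    unsigned idx;   /* index of the oldest element */
    unsigned cnt;   /* number of stored elements */
} ring_t;

/* Same structure as EnqueueTxUSBHID: write at the logical end, then either
 * grow the count or, if already full, advance past the overwritten slot. */
void ring_enqueue(ring_t *q, uint8_t v)
{
    assert(q != NULL);
    unsigned end = (q->idx + q->cnt) % RING_LEN;
    q->buf[end] = v;
    if (q->cnt == RING_LEN) {
        q->idx = (q->idx + 1u) % RING_LEN;   /* full: drop the oldest */
    } else {
        q->cnt++;
    }
}

/* Stores the oldest element in *out and returns 0, or returns -1 if the
 * buffer is empty. */
int ring_dequeue(ring_t *q, uint8_t *out)
{
    assert(q != NULL && out != NULL);
    if (q->cnt == 0) {
        return -1;
    }
    *out = q->buf[q->idx];
    q->idx = (q->idx + 1u) % RING_LEN;
    q->cnt--;
    return 0;
}
```

Enqueuing 1, 2, 3, 4, 5 into the empty 4-element buffer and then draining it returns 2, 3, 4, 5: the first element was overwritten by the fifth.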

Back to the interesting stuff, the following function is called by the MLA USB library and functions as the main USB event handler (note that this event is different from the USB interrupt, as we will see later):

bool USER_USB_CALLBACK_EVENT_HANDLER(USB_EVENT event, void *pdata, uint16_t size)
{
    switch((int)event)
    {
        case EVENT_CONFIGURED:
            USBHIDInit();
            break;
        case EVENT_EP0_REQUEST:
            USBCheckHIDRequest();
            break;
        case EVENT_BUS_ERROR:
            SetUIMessage("F.A.I.L.");
            break;
        default:
            break;
    }
    return true;
}

EVENT_CONFIGURED is raised when the host selects a configuration (that is, enumeration has succeeded), at which point the HID endpoint can be set up. EVENT_EP0_REQUEST (Endpoint 0 Request) is raised whenever a request arrives on Endpoint 0 (the default control endpoint); the handler calls the USBCheckHIDRequest function in the MLA USB library to check whether the request is an HID request. USBCheckHIDRequest internally processes all HID requests (e.g. requests for HID descriptors and reports) on EP0 for the USB module. EVENT_BUS_ERROR is self-explanatory and should not be raised during normal operation. In case of an error event, CTL122 sets the segment display to show FAIL.

void USBHIDCBSetReportHandler(void)
{
    USBEP0Receive((uint8_t *)&CtrlTrfData,
            USB_EP0_BUFF_SIZE,
            USBHIDCBSetReportComplete);
}

void USBHIDCBSetReportComplete(void)
{
    /* 1 byte of LED state data should now be in the CtrlTrfData buffer.  Copy
     * it to the OUTPUT report buffer for processing */
    outputReport.value = CtrlTrfData[0];
}

USBHIDCBSetReportHandler function is specified in usb_config.h as USER_SET_REPORT_HANDLER, which is referenced by the MLA USB library to set the USB output report handler. This handler is called by the USBCheckHIDRequest function mentioned above whenever there is an output report received on the Endpoint 0. The handler in turn calls USBEP0Receive function which copies the received report data to the CtrlTrfData buffer (defined in MLA USB library code) and calls USBHIDCBSetReportComplete. USBHIDCBSetReportComplete copies the received data to the outputReport structure (note that, by HID spec., it should only receive one byte of data that contains the keyboard LED statuses).

void USBHIDProcess(void)
{
    /* Check if the IN endpoint is busy, and if it isn't check if we want to send
     * keystroke data to the host. */
    if (HIDTxHandleBusy(USBHIDContext.lastINTransmission) == false)
    {
        // If the transmit queue is not empty
        if (GetTxCountUSBHID() > 0)
        {
            // Dequeue
            DequeueTxUSBHID(&inputReport);
            // Transmit over USB
            USBHIDContext.lastINTransmission = HIDTxPacket(HID_EP,
                (uint8_t *)&inputReport,
                sizeof(inputReport));
        }
    }

    /* Check if any data was sent from the PC to the keyboard device.  Report
     * descriptor allows host to send 1 byte of data.  Bits 0-4 are LED states,
     * bits 5-7 are unused pad bits.  The host can potentially send this OUT
     * report data through the HID OUT endpoint (EP1 OUT), or, alternatively,
     * the host may try to send LED state information by sending a SET_REPORT
     * control transfer on EP0.  See the USBHIDCBSetReportHandler() function. */
    if (HIDRxHandleBusy(USBHIDContext.lastOUTTransmission) == false)
    {
        USBHIDContext.lastOUTTransmission = HIDRxPacket(HID_EP,
                (uint8_t *)&outputReport, sizeof(outputReport));

        // Update keyboard LEDs
        UI_LED_NUMLK = outputReport.leds.numLock;
        UI_LED_CAPSLK = outputReport.leds.capsLock;
        UI_LED_SCRLLK = outputReport.leds.scrollLock;
    }
}

USBHIDProcess is called by the interrupt service routine (declared in Interrupt.c) when a USB interrupt is raised (ISR is called and PIR2bits.USBIF is set).

The function consists of two separate sections: input report handling and output report handling.

The first section (HIDTxHandleBusy) checks whether the IN endpoint (CTL122 to host computer) handle is busy (the USBHIDContext structure is referenced here, as its handle identifies the previous transmission). If the handle is not busy, it checks whether any KEYBOARD_INPUT_REPORT structures are queued. If no input report is queued, the function simply moves on to handle the output report; otherwise, it de-queues one input report and transmits it to the host computer. There is no loop to transmit all available queue elements, as the USB interrupt is raised very frequently and an interrupt service routine should be as short as possible.

Once the input report is processed, the same thing happens with the output report (host computer to CTL122). The function first checks whether the previous OUT transfer has completed; if it has, HIDRxPacket is called to start a new receive sequence into outputReport. The function then updates the UI keyboard LED values from outputReport.

Note that USBHIDCBSetReportHandler/Complete are not necessary as HIDRxPacket is called here and updates the outputReport. The reason for implementing these functions is to ensure that, if future implementations decide to reference the outputReport arbitrarily, outputReport structure is always up to date.

Lastly (this probably should have come first, but it's not bad to have the details first to understand the basics),

void USBHIDInit(void)
{
    // Initialise the variable holding the handle for the last transmission
    USBHIDContext.lastINTransmission = 0;

    // Enable the HID endpoint
    USBEnableEndpoint(HID_EP,
            USB_IN_ENABLED |
            USB_OUT_ENABLED |
            USB_HANDSHAKE_ENABLED |
            USB_DISALLOW_SETUP);

    // Arm OUT endpoint for LED states info from the host
    USBHIDContext.lastOUTTransmission = HIDRxPacket(HID_EP,
            (uint8_t*)&outputReport,
            sizeof(outputReport));
}

USBHIDInit is called by the USB event handler on the EVENT_CONFIGURED event. The function initialises the transmission handles in USBHIDContext and enables the HID endpoint, thereby allowing communication with the host. Note that the lastINTransmission handle is set to 0 since HIDTxHandleBusy treats 0 as an always-not-busy handle, and the lastOUTTransmission handle is set by initiating a new receive sequence to ensure that HIDRxHandleBusy does not hang.

That's it for this part. In the next part, we will take a look at the PS/2 communication part of the firmware.

CTL122: Firmware Walkthrough (Pt. 2 USB Configuration)

Continuing from the Firmware Walkthrough Part 1, we begin Part 2 with an overview of the USB stack configuration.

As previously mentioned, instead of implementing the full USB device driver, CTL122 uses the Microchip MLA USB library for USB operations. The MLA requires a few static configuration files to be present in the project for specifying USB stack behaviours and descriptors associated with them.

The MLA USB configuration files are divided into two parts: usb_config.h specifying various definitions for the USB stack behaviours and usb_descriptors.c declaring all the descriptors required by the USB stack and the on-chip controller to allow CTL122 to be detected as an HID (Human Interface Device).

#ifndef USB_CONFIG_H
#define	USB_CONFIG_H

#include <usb/usb_ch9.h>

/** DEFINITIONS ****************************************************/
#define USB_EP0_BUFF_SIZE   8   // Valid Options: 8, 16, 32, or 64 bytes.
                                // Using larger options take more SRAM, but
                                // does not provide much advantage in most types
                                // of applications.  Exceptions to this, are applications
                                // that use EP0 IN or OUT for sending large amounts of
                                // application related data.

#define USB_MAX_NUM_INT     1   // For tracking Alternate Setting
#define USB_MAX_EP_NUMBER   1

//Make sure only one of the below "#define USB_PING_PONG_MODE"
//is uncommented.
//#define USB_PING_PONG_MODE USB_PING_PONG__NO_PING_PONG
#define USB_PING_PONG_MODE USB_PING_PONG__FULL_PING_PONG
//#define USB_PING_PONG_MODE USB_PING_PONG__EP0_OUT_ONLY
//#define USB_PING_PONG_MODE USB_PING_PONG__ALL_BUT_EP0		//NOTE: This mode is not supported in PIC18F4550 family rev A3 devices

//#define USB_POLLING
#define USB_INTERRUPT

/* Parameter definitions are defined in usb_device.h */
#define USB_PULLUP_OPTION USB_PULLUP_ENABLE
//#define USB_PULLUP_OPTION USB_PULLUP_DISABLED

#define USB_TRANSCEIVER_OPTION USB_INTERNAL_TRANSCEIVER
//External Transceiver support is not available on all product families.  Please
//  refer to the product family datasheet for more information if this feature
//  is available on the target processor.
//#define USB_TRANSCEIVER_OPTION USB_EXTERNAL_TRANSCEIVER

#define USB_SPEED_OPTION USB_FULL_SPEED
//#define USB_SPEED_OPTION USB_LOW_SPEED //(not valid option for PIC24F devices)

#define MY_VID 0x04D8
#define MY_PID 0x0055

//------------------------------------------------------------------------------------------------------------------
//Option to enable auto-arming of the status stage of control transfers, if no
//"progress" has been made for the USB_STATUS_STAGE_TIMEOUT value.
//If progress is made (any successful transactions completing on EP0 IN or OUT)
//the timeout counter gets reset to the USB_STATUS_STAGE_TIMEOUT value.
//
//During normal control transfer processing, the USB stack or the application 
//firmware will call USBCtrlEPAllowStatusStage() as soon as the firmware is finished
//processing the control transfer.  Therefore, the status stage completes as 
//quickly as is physically possible.  The USB_ENABLE_STATUS_STAGE_TIMEOUTS 
//feature, and the USB_STATUS_STAGE_TIMEOUT value are only relevant, when:
//1.  The application uses the USBDeferStatusStage() API function, but never calls
//      USBCtrlEPAllowStatusStage().  Or:
//2.  The application uses host to device (OUT) control transfers with data stage,
//      and some abnormal error occurs, where the host might try to abort the control
//      transfer, before it has sent all of the data it claimed it was going to send.
//
//If the application firmware never uses the USBDeferStatusStage() API function,
//and it never uses host to device control transfers with data stage, then
//it is not required to enable the USB_ENABLE_STATUS_STAGE_TIMEOUTS feature.

#define USB_ENABLE_STATUS_STAGE_TIMEOUTS    //Comment this out to disable this feature.  

//Section 9.2.6 of the USB 2.0 specifications indicate that:
//1.  Control transfers with no data stage: Status stage must complete within 
//      50ms of the start of the control transfer.
//2.  Control transfers with (IN) data stage: Status stage must complete within 
//      50ms of sending the last IN data packet in fulfilment of the data stage.
//3.  Control transfers with (OUT) data stage: No specific status stage timing
//      requirement.  However, the total time of the entire control transfer (ex:
//      including the OUT data stage and IN status stage) must not exceed 5 seconds.
//
//Therefore, if the USB_ENABLE_STATUS_STAGE_TIMEOUTS feature is used, it is suggested
//to set the USB_STATUS_STAGE_TIMEOUT value to timeout in less than 50ms.  If the
//USB_ENABLE_STATUS_STAGE_TIMEOUTS feature is not enabled, then the USB_STATUS_STAGE_TIMEOUT
//parameter is not relevant.

#define USB_STATUS_STAGE_TIMEOUT     (uint8_t)45   //Approximate timeout in milliseconds, except when
                                                //USB_POLLING mode is used, and USBDeviceTasks() is called at < 1kHz
                                                //In this special case, the timeout becomes approximately:
//Timeout(in milliseconds) = ((1000 * (USB_STATUS_STAGE_TIMEOUT - 1)) / (USBDeviceTasks() polling frequency in Hz))
//------------------------------------------------------------------------------------------------------------------

#define USB_SUPPORT_DEVICE

#define USB_NUM_STRING_DESCRIPTORS 3

/** DEVICE CLASS USAGE *********************************************/
#define USB_USE_HID

/** ENDPOINTS ALLOCATION *******************************************/

/* HID */
#define HID_INTF_ID             0x00
#define HID_EP 			1
#define HID_INT_OUT_EP_SIZE     1
#define HID_INT_IN_EP_SIZE      8
#define HID_NUM_OF_DSC          1
#define HID_RPT01_SIZE          63
//#define USER_GET_REPORT_HANDLER USBHIDCBGetReportHandler	
#define USER_SET_REPORT_HANDLER USBHIDCBSetReportHandler	

/** DEFINITIONS ****************************************************/

#endif	/* USB_CONFIG_H */

The usb_config.h above is based on that of the Microchip MLA HID example. The configuration begins with the EP0 buffer size and the number of interfaces and endpoints. The USB stack is configured to use the internal (on-chip) USB transceiver and to generate interrupts on USB events (USB_INTERRUPT). Although not absolutely necessary, the device is configured to operate at Full Speed (12 Mb/s) to minimise any potential problems with modern hosts that are used to dealing with Full/High/Super Speed devices. The VID (vendor ID) and PID (product ID) are carried over from the example (0x04D8 is Microchip's vendor ID), and the HID-related parameters (endpoint number, sizes, descriptor count, etc.) are defined at the end.
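The timeout formula quoted in the usb_config.h comments only applies in USB_POLLING mode with USBDeviceTasks() called at less than 1 kHz; since CTL122 uses USB_INTERRUPT, the value 45 is simply the approximate timeout in milliseconds. For completeness, here is a small sketch of the polling-mode arithmetic. The function name is hypothetical and the formula is taken directly from the comment above.

```c
#include <stdint.h>

/* Approximate status stage timeout in milliseconds when USBDeviceTasks()
 * is polled at polling_hz (below 1 kHz), per the usb_config.h comment:
 *   timeout_ms = 1000 * (USB_STATUS_STAGE_TIMEOUT - 1) / polling_hz
 * Returns 0 for invalid input (a zero setting or a zero frequency). */
uint32_t status_stage_timeout_ms(uint8_t timeout_setting, uint32_t polling_hz)
{
    if (timeout_setting == 0 || polling_hz == 0) {
        return 0;
    }
    return (1000u * ((uint32_t)timeout_setting - 1u)) / polling_hz;
}
```

With the setting of 45 and a polling rate of 500 Hz, the result is 88 ms, which is above the 50 ms figure suggested in the comments; a setting of 26 would bring it back to 50 ms at that rate.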

The following is an excerpt from usb_descriptors.c showing the primary device descriptor:

/* Device Descriptor */
const USB_DEVICE_DESCRIPTOR device_dsc =
{
    0x12,                   // Size of this descriptor in bytes
    USB_DESCRIPTOR_DEVICE,  // DEVICE descriptor type
    0x0200,                 // USB Spec Release Number in BCD format
    0x00,                   // Class Code
    0x00,                   // Subclass code
    0x00,                   // Protocol code
    USB_EP0_BUFF_SIZE,      // Max packet size for EP0, see usb_config.h
    MY_VID,                 // Vendor ID
    MY_PID,                 // Product ID: Keyboard fw demo
    0x0001,                 // Device release number in BCD format
    0x01,                   // Manufacturer string index
    0x02,                   // Product string index
    0x00,                   // Device serial number string index
    0x01                    // Number of possible configurations
};

The device descriptor is what identifies a USB device when it is attached to a host. The descriptor contains the device class and protocol codes that identify the type of the device (not used in this case since they are set to zero; this information is defined in the interface descriptor instead), the VID/PID that identify the manufacturer and the product, and the indexes of the string descriptors containing the manufacturer and product names.

/* Configuration 1 Descriptor */
const uint8_t configDescriptor1[] = {
    /* Configuration Descriptor */
    0x09, //sizeof(USB_CFG_DSC),    // Size of this descriptor in bytes
    USB_DESCRIPTOR_CONFIGURATION,   // CONFIGURATION descriptor type
    DESC_CONFIG_WORD(0x0029),       // Total length of data for this cfg
    1,                              // Number of interfaces in this cfg
    1,                              // Index value of this configuration
    0,                              // Configuration string index
    _DEFAULT | _SELF | _RWU,        // Attributes, see usb_device.h
    50,                             // Max power consumption (2X mA)

    /* Interface Descriptor */
    0x09, //sizeof(USB_INTF_DSC),   // Size of this descriptor in bytes
    USB_DESCRIPTOR_INTERFACE,       // INTERFACE descriptor type
    0,                              // Interface Number
    0,                              // Alternate Setting Number
    2,                              // Number of endpoints in this intf
    HID_INTF,                       // Class code
    BOOT_INTF_SUBCLASS,             // Subclass code
    HID_PROTOCOL_KEYBOARD,          // Protocol code
    0,                              // Interface string index

    /* HID Class-Specific Descriptor */
    0x09, //sizeof(USB_HID_DSC)+3,  // Size of this descriptor in bytes RRoj hack
    DSC_HID,                        // HID descriptor type
    DESC_CONFIG_WORD(0x0111),       // HID Spec Release Number in BCD format (1.11)
    0x00,                           // Country Code (0x00 for Not supported)
    HID_NUM_OF_DSC,                 // Number of class descriptors, see usbcfg.h
    DSC_RPT,                        // Report descriptor type
    DESC_CONFIG_WORD(63), //sizeof(hid_rpt01),
                                    // Size of the report descriptor

    /* Endpoint Descriptor */
    0x07, /*sizeof(USB_EP_DSC)*/
    USB_DESCRIPTOR_ENDPOINT,        // Endpoint Descriptor
    HID_EP | _EP_IN,                // EndpointAddress
    _INTERRUPT,                     // Attributes
    DESC_CONFIG_WORD(8),            // size
    0x01,                           // Interval

    /* Endpoint Descriptor */
    0x07, /*sizeof(USB_EP_DSC)*/
    USB_DESCRIPTOR_ENDPOINT,        // Endpoint Descriptor
    HID_EP | _EP_OUT,               // EndpointAddress
    _INTERRUPT,                     // Attributes
    DESC_CONFIG_WORD(8),            // size
    0x01                            // Interval

};

The excerpt above shows the Configuration Descriptor 1 for the device. Note that a USB device may contain multiple configurations (the number of possible configurations is specified in the device descriptor). Since CTL122 is an HID device and does not carry out any other functions, there is only one configuration descriptor.

The configuration descriptor begins with the usual descriptor header identifying the size and type of the descriptor, the number of interfaces and the configuration index. Following the header, the interface descriptor specifies the interface number, class codes (note that the device descriptor class codes were set to 0) and number of endpoints for the interface. Besides the interface descriptor, there is an additional descriptor for the HID class called the HID Class-specific Descriptor. This descriptor essentially acts as the header for the report descriptors that we will see later. The Configuration 1 descriptor ends with the two endpoint descriptors (note that two endpoints were specified in the interface descriptor section). Each endpoint descriptor identifies the direction (IN or OUT), type (INTERRUPT) and size of the endpoint. The endpoint type INTERRUPT, in the context of the USB specification, does not refer to a hardware interrupt of the microcontroller, but rather to a type of endpoint that transfers small amounts of data when polled by the host at a fixed interval.

// Array of configuration descriptors
const uint8_t *const USB_CD_Ptr[] =
{
    (const uint8_t *const)&configDescriptor1
};

// Array of string descriptors
const uint8_t *const USB_SD_Ptr[] =
{
    (const uint8_t *const)&sd000,
    (const uint8_t *const)&sd001,
    (const uint8_t *const)&sd002
};

All the individual configuration descriptors and string descriptors (code not included as they aren't all that interesting) are referenced by the declarations above. USB_CD_Ptr and USB_SD_Ptr are statically referenced by the MLA USB stack and must be present at link time.

The following is the body of the report descriptor mentioned above:

// Class specific descriptor - HID Keyboard
const struct {
    uint8_t report[HID_RPT01_SIZE];
} hid_rpt01 = { {
    0x05, 0x01,                    // USAGE_PAGE (Generic Desktop)
    0x09, 0x06,                    // USAGE (Keyboard)
    0xa1, 0x01,                    // COLLECTION (Application)
    0x05, 0x07,                    //   USAGE_PAGE (Keyboard)
    0x19, 0xe0,                    //   USAGE_MINIMUM (Keyboard LeftControl)
    0x29, 0xe7,                    //   USAGE_MAXIMUM (Keyboard Right GUI)
    0x15, 0x00,                    //   LOGICAL_MINIMUM (0)
    0x25, 0x01,                    //   LOGICAL_MAXIMUM (1)
    0x75, 0x01,                    //   REPORT_SIZE (1)
    0x95, 0x08,                    //   REPORT_COUNT (8)
    0x81, 0x02,                    //   INPUT (Data,Var,Abs)
    0x95, 0x01,                    //   REPORT_COUNT (1)
    0x75, 0x08,                    //   REPORT_SIZE (8)
    0x81, 0x03,                    //   INPUT (Cnst,Var,Abs)
    0x95, 0x05,                    //   REPORT_COUNT (5)
    0x75, 0x01,                    //   REPORT_SIZE (1)
    0x05, 0x08,                    //   USAGE_PAGE (LEDs)
    0x19, 0x01,                    //   USAGE_MINIMUM (Num Lock)
    0x29, 0x05,                    //   USAGE_MAXIMUM (Kana)
    0x91, 0x02,                    //   OUTPUT (Data,Var,Abs)
    0x95, 0x01,                    //   REPORT_COUNT (1)
    0x75, 0x03,                    //   REPORT_SIZE (3)
    0x91, 0x03,                    //   OUTPUT (Cnst,Var,Abs)
    0x95, 0x06,                    //   REPORT_COUNT (6)
    0x75, 0x08,                    //   REPORT_SIZE (8)
    0x15, 0x00,                    //   LOGICAL_MINIMUM (0)
    0x25, 0x65,                    //   LOGICAL_MAXIMUM (101)
    0x05, 0x07,                    //   USAGE_PAGE (Keyboard)
    0x19, 0x00,                    //   USAGE_MINIMUM (Reserved (no event indicated))
    0x29, 0x65,                    //   USAGE_MAXIMUM (Keyboard Application)
    0x81, 0x00,                    //   INPUT (Data,Ary,Abs)
    0xc0                           // End Collection
} };

The report descriptor contains various HID device-specific information, including the types of available keys and inputs. The specific details of the HID report descriptor are beyond the scope of this article, and I encourage you to refer to the USB HID class specification for further details.
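As a rough illustration of what the descriptor above declares, the following sketch models the resulting reports as plain C data: an 8-byte input report (eight modifier bits, one constant byte, six key slots) and a 1-byte output report (five LED bits plus three padding bits). The names kbd_input_report_t, kbd_report_press and the KBD_LED_* constants are made up for this example and are not taken from the CTL122 firmware or the MLA stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Input report (device to host): 8 bytes in total. */
typedef struct {
    uint8_t modifiers; /* 8 x 1-bit flags, usages 0xE0 (LeftControl) to 0xE7 (Right GUI) */
    uint8_t reserved;  /* 1 x 8-bit constant field */
    uint8_t keys[6];   /* 6 x 8-bit array slots, usages 0x00 to 0x65 */
} kbd_input_report_t;

static_assert(sizeof(kbd_input_report_t) == 8, "input report must be 8 bytes");

/* Output report (host to device): 5 LED bits plus 3 constant padding bits. */
enum {
    KBD_LED_NUM_LOCK    = 1u << 0,
    KBD_LED_CAPS_LOCK   = 1u << 1,
    KBD_LED_SCROLL_LOCK = 1u << 2,
    KBD_LED_COMPOSE     = 1u << 3,
    KBD_LED_KANA        = 1u << 4
};

/* Record a pressed key in the report. Returns false if the usage is outside
   the ranges declared by the descriptor or all six slots are already taken. */
bool kbd_report_press(kbd_input_report_t *report, uint8_t usage)
{
    if (usage >= 0xE0 && usage <= 0xE7) {
        report->modifiers |= (uint8_t)(1u << (usage - 0xE0));
        return true;
    }
    if (usage == 0x00 || usage > 0x65) {
        return false;
    }
    for (int i = 0; i < 6; i++) {
        if (report->keys[i] == usage) {
            return true; /* already reported */
        }
    }
    for (int i = 0; i < 6; i++) {
        if (report->keys[i] == 0x00) {
            report->keys[i] = usage;
            return true;
        }
    }
    return false;
}
```

The 8-byte size matches the endpoint size declared in the endpoint descriptors, which is why a whole report fits into a single interrupt transfer.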

CTL122: Firmware Walkthrough (Pt. 1 Overview)

The primary role of CTL122 firmware is to provide the PS/2 and USB interface drivers and handle PS/2 to USB scancode conversion. It will also provide various custom key handling logic to allow CFG and MEM function handling as specified in the previous articles of this series.

The following is the list of the files in the CTL122 firmware source code and brief descriptions of what they do:
[mla/usb] - Microchip Libraries for Applications (MLA) USB Device Driver
Init.c - Microcontroller Initialisation Routines
Interrupt.c - Interrupt Service-related Routines
Keyboard.c - Key Processing Logic
Main.c - Entry Point
Port.h - I/O Port Declaration
PS2.c - PS/2 Device Driver/Queue
Serial.c - Serial Port Debug Driver
system.h/system_config.h - Header Files required by MLA
UI.c - UI Processing Routines
USB.c - USB Device Driver/Queue
usb_config.h/usb_descriptors.c - MLA USB Library Initialisation Objects
Util.c - Useful Functions

CTL122: Microcontroller Pinout

Referring back to the schematic, we will first begin with the port definitions in Port.h.

/******************************************************************************/
/* Port Definitions                                                           */
/******************************************************************************/
#define PS2_CLK     PORTCbits.RC2
#define PS2_DATA    PORTCbits.RC1

#define LED_PWR     LATAbits.LATA0
#define LED_CFG     LATAbits.LATA1
#define LED_MEM     LATAbits.LATA2
#define LED_NUMLK   LATAbits.LATA3
#define LED_CAPSLK  LATAbits.LATA4
#define LED_SCRLLK  LATAbits.LATA5

#define DISP_EN1    LATBbits.LATB0
#define DISP_EN2    LATBbits.LATB1
#define DISP_EN3    LATBbits.LATB2
#define DISP_EN4    LATBbits.LATB3
#define DISP_A      LATDbits.LATD0
#define DISP_B      LATDbits.LATD1
#define DISP_C      LATDbits.LATD2
#define DISP_D      LATDbits.LATD3
#define DISP_E      LATDbits.LATD4
#define DISP_F      LATDbits.LATD5
#define DISP_G      LATDbits.LATD6
#define DISP_P      LATDbits.LATD7

All ports are referenced through the aliases provided above throughout the entire source code to allow easy adaptation of the firmware.

/******************************************************************************/
/* Configuration                                                              */
/******************************************************************************/

// Generic
#pragma config PWRT = OFF
#pragma config BOR = ON
#pragma config BORV = 3
#pragma config WDT = OFF
#pragma config LVP = OFF
#pragma config MCLRE = OFF
#pragma config VREGEN = ON

// Clock Subsystem
#pragma config FOSC = HSPLL_HS
#pragma config PLLDIV = 5
#pragma config CPUDIV = OSC1_PLL2
#pragma config USBDIV = 2
#pragma config IESO = OFF
#pragma config FCMEN = OFF
#pragma config LPT1OSC = OFF

Now we will take a look at the static microcontroller configuration bits. The first line disables the Power-up Timer (PWRT), as no power-up delay is needed to cover a slow Vdd rise (with USB, Vdd rises almost instantaneously). The next line enables the Brown-out Reset (BOR) so that the microcontroller resets safely in case of a voltage dip, and BORV = 3 selects the brown-out threshold. Note that BORV is a setting index rather than a voltage; on this device family, setting 3 is the lowest threshold (about 2V according to the datasheet). WDT = OFF turns off the watchdog timer. LVP = OFF disables the low voltage programming feature through ICSP (an extra programming pin is required for this feature and it was not implemented on the board as the feature was deemed unnecessary). MCLRE = OFF disables the Master Clear (MCLR) reset function of the pin and allows it to be utilised as a general purpose pin (in this case, it is simply reserved as the ICSP programming Vpp pin; either way, we saved one resistor and one capacitor). VREGEN = ON enables the on-chip 3V3 USB regulator (VUSB) to supply the USB core.

PIC18F4550 Clock Subsystem Diagram

For the clock subsystem configuration, FOSC = HSPLL_HS instructs the microcontroller to utilise an external HS oscillator with the PLL enabled. PLLDIV = 5 selects the divide-by-5 PLL prescaler output (input clock = 20MHz) to supply the 4MHz clock that the USB PLL requires. CPUDIV = OSC1_PLL2 sets the PLL postscaler multiplexer to select the divide-by-2 output of the PLL (the PLL frequency is 96MHz, so /2 = 48MHz), which in turn is supplied to the CPU and peripherals. USBDIV = 2 selects the 96MHz PLL output divided by 2 (48MHz), rather than the primary oscillator directly, as the USB peripheral clock. IESO = OFF disables Internal-External Clock Source Switchover in order to force the use of the external clock at all times, and FCMEN = OFF disables the Fail-Safe Clock Monitor, once again, to force the use of the external oscillator at all times. LPT1OSC = OFF configures the Timer 1 oscillator for higher power operation as (according to the datasheet) its low power mode is more sensitive to interference. I sincerely hope this rather detailed description of the configuration bits lessened your task of reading over the 438-page datasheet for this microcontroller.
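The clock arithmetic above can be double-checked with a few compile-time constants. This is only a sketch of the numbers discussed in this article (20MHz crystal, 4MHz PLL input, 96MHz PLL output); the macro names are invented for the example, and the instruction clock figure relies on the PIC18 executing one instruction cycle per four oscillator cycles.

```c
#include <assert.h>

#define XTAL_HZ     20000000UL              /* external HS crystal */
#define PLLDIV      5UL                     /* PLL prescaler: divide by 5 */
#define PLL_IN_HZ   (XTAL_HZ / PLLDIV)      /* must come out at 4MHz */
#define PLL_OUT_HZ  (PLL_IN_HZ * 24UL)      /* the PLL turns 4MHz into 96MHz */
#define CPU_HZ      (PLL_OUT_HZ / 2UL)      /* CPUDIV = OSC1_PLL2: PLL / 2 */
#define USB_HZ      (PLL_OUT_HZ / 2UL)      /* USBDIV = 2: PLL / 2 */
#define INSTR_HZ    (CPU_HZ / 4UL)          /* one instruction cycle = 4 clocks */

static_assert(PLL_IN_HZ == 4000000UL, "PLL input must be exactly 4MHz");
static_assert(PLL_OUT_HZ == 96000000UL, "PLL output is expected to be 96MHz");
static_assert(USB_HZ == 48000000UL, "full-speed USB needs a 48MHz clock");
```

Changing XTAL_HZ without adjusting PLLDIV makes the first assertion fail at compile time, which is a cheap way to catch a mismatched crystal.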

Moving on, we will now take a look at the entry point of the firmware.

/******************************************************************************/
/* Main Program                                                               */
/******************************************************************************/
void main(void)
{
    // Initialise I/O
    InitIO();
    // Initialise capture
    InitCapture();
    // Initialise timer
    InitTimer();
    // Initialise EUSART
    InitEUSART();
    // Initialise USB
    InitUSB();

    // Print init. message
    printf("CTL122 Initialised\r\n");

    // Set standby UI state
    SetUIMessage("I N I T ");

    // Processing Loop
    while (TRUE)
    {
        ProcessUI();
    }
}

The entry point begins by calling the initialisation functions (InitIO, InitCapture, InitTimer, InitEUSART, InitUSB). These functions are defined in Init.c and initialise the microcontroller and its peripherals. After initialisation, a debug message is printed to the UART output and the entry point enters the user interface processing loop. Note that printf is redirected to the UART because the PIC C library printf function internally calls putch for console output, and putch is defined in Serial.c to send every character it receives to the UART port. Also note that the UART pins (TX and RX) are left unconnected on the PCB. This debug print function was used during prototype development and its usage is now deprecated, as printf causes a serious system slowdown when used in frequently called routines (e.g. the interrupt service routine). The entry point loop does not handle anything other than UI processing; all I/O peripheral related processing is done by the interrupt handler. This ensures that I/O requests are prioritised over UI handling.

void InitIO(void)
{
    // Enable PS/2 inputs
    TRISCbits.TRISC2 = 1; // PS2_CLK
    TRISCbits.TRISC1 = 1; // PS2_DATA
    // Enable indicator LED outputs
    TRISAbits.TRISA0 = 0;
    LED_PWR = 0;
    TRISAbits.TRISA1 = 0;
    LED_CFG = 0;
    TRISAbits.TRISA2 = 0;
    LED_MEM = 0;
    TRISAbits.TRISA3 = 0;
    LED_NUMLK = 0;
    TRISAbits.TRISA4 = 0;
    LED_CAPSLK = 0;
    TRISAbits.TRISA5 = 0;
    LED_SCRLLK = 0;
    // Enable 7-segment display outputs
    TRISDbits.TRISD0 = 0; // A
    DISP_A = 0;
    TRISDbits.TRISD1 = 0; // B
    DISP_B = 0;
    TRISDbits.TRISD2 = 0; // C
    DISP_C = 0;
    TRISDbits.TRISD3 = 0; // D
    DISP_D = 0;
    TRISDbits.TRISD4 = 0; // E
    DISP_E = 0;
    TRISDbits.TRISD5 = 0; // F
    DISP_F = 0;
    TRISDbits.TRISD6 = 0; // G
    DISP_G = 0;
    TRISDbits.TRISD7 = 0; // P
    DISP_P = 0;
    TRISBbits.TRISB0 = 0; // EN1
    DISP_EN1 = 0;
    TRISBbits.TRISB1 = 0; // EN2
    DISP_EN2 = 0;
    TRISBbits.TRISB2 = 0; // EN3
    DISP_EN3 = 0;
    TRISBbits.TRISB3 = 0; // EN4
    DISP_EN4 = 0;
}

InitIO function initialises all defined microcontroller I/O ports. For each port, the function sets the value of the corresponding TRIS register (0 if output, 1 if input) and initialises its value to 0 if the port is an output.

void InitTimer(void)
{
    // Initialise Timer 0
    // Set clock source to internal instruction clock
    T0CONbits.T0CS = 0;
    // Assign the prescaler to Timer 0 (PSA = 0, not bypassed), ratio 1:32
    T0CONbits.T0PS = 0b100;
    T0CONbits.PSA = 0;
    // Select 16-bit counter mode (T08BIT = 1 would select 8-bit mode)
    T0CONbits.T08BIT = 0;
    // Enable interrupts
    INTCONbits.TMR0IE = 1;
    INTCONbits.PEIE = 1;
    INTCONbits.GIE = 1;
    // Enable timer
    T0CONbits.TMR0ON = 1;

    // Initialise Timer 2
    OpenTimer2(TIMER_INT_ON & T2_PS_1_1);
}

The InitTimer function initialises the two timers (Timer 0 and Timer 2) required for firmware operation. Technically, Timer 0 is not absolutely required; it was used for the heartbeat LED flash during prototyping and still serves the same purpose. Timer 2 provides the polling interval for the USB handler. Note that Timer 0 is manually initialised by setting individual hardware registers, while Timer 2 is initialised by the library function OpenTimer2. The reason for that is quite simple: I got lazy. I was planning on rewriting the Timer 0 initialisation routine with the library call as well for the sake of consistency, but I decided to keep it as is to demonstrate the manual timer initialisation method.
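To get a feel for the heartbeat rate, the Timer 0 overflow period can be estimated from the register values in InitTimer. The sketch below assumes a 48MHz system clock, that T0PS = 0b100 corresponds to a 1:32 prescaler, that T08BIT = 0 selects the 16-bit mode, and that the interrupt handler lets the counter run the full range without reloading it; the function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Timer 0 prescaler ratio for a 3-bit T0PS value: 0b000 = 1:2 ... 0b111 = 1:256. */
uint32_t timer0_prescale_ratio(uint8_t t0ps)
{
    assert(t0ps <= 7);
    return (uint32_t)(2ul << t0ps);
}

/* Overflow period in microseconds. The timer counts instruction cycles,
   i.e. one count per four oscillator cycles, divided by the prescaler. */
uint32_t timer0_overflow_us(uint32_t fosc_hz, uint8_t t0ps, unsigned counter_bits)
{
    assert(fosc_hz > 0 && (counter_bits == 8 || counter_bits == 16));
    uint64_t counts = 1ull << counter_bits;
    uint64_t fosc_cycles = counts * timer0_prescale_ratio(t0ps) * 4u;
    return (uint32_t)(fosc_cycles * 1000000ull / fosc_hz);
}
```

With these assumptions, timer0_overflow_us(48000000, 4, 16) comes out at roughly 175ms per overflow, a plausible rate for a visible LED blink.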

void InitCapture(void)
{
    // Initialise Capture 1
    OpenCapture1(C1_EVERY_FALL_EDGE & CAPTURE_INT_ON);
    // Initialise the Timer 1
    OpenTimer1(TIMER_INT_ON & T1_SOURCE_EXT & T1_PS_1_1 & T1_16BIT_RW);
    WriteTimer1(0x0000);
}

InitCapture initialises the capture module (CCP1) used for PS/2 interface. It configures the capture module to trigger its interrupt on every falling edge of the PS2_CLK clock. It also initialises the Timer 1 for PS/2 port UART transmission timeout detection. Timer 1 is reset at every start bit trigger and reinitialises the UART receive routine if a timeout occurs.
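The receive logic driven by the capture interrupt can be pictured as a small state machine that consumes one PS2_DATA sample per falling clock edge. The following is a host-side sketch of the standard PS/2 device-to-host frame (start bit 0, eight data bits LSB first, odd parity, stop bit 1), not the actual code from PS2.c; ps2_rx_t and the function names are hypothetical, and ps2_rx_reset stands in for what a Timer 1 timeout would trigger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t bit_index; /* 0 = waiting for start, 1..8 = data, 9 = parity, 10 = stop */
    uint8_t data;      /* data bits collected so far */
    uint8_t ones;      /* number of 1 bits seen in data + parity */
} ps2_rx_t;

/* Return to the idle state (also what a receive timeout would do). */
void ps2_rx_reset(ps2_rx_t *rx)
{
    rx->bit_index = 0;
    rx->data = 0;
    rx->ones = 0;
}

/* Feed one PS2_DATA sample taken on a falling PS2_CLK edge.
   Returns true when a complete, valid byte has been stored in *out. */
bool ps2_rx_clock_edge(ps2_rx_t *rx, bool data_bit, uint8_t *out)
{
    assert(rx != NULL && out != NULL);
    if (rx->bit_index == 0) {
        if (!data_bit) {           /* start bit is always 0 */
            rx->bit_index = 1;
            rx->data = 0;
            rx->ones = 0;
        }
        return false;
    }
    if (rx->bit_index <= 9) {
        if (data_bit) {
            rx->ones++;
            if (rx->bit_index <= 8) {
                rx->data |= (uint8_t)(1u << (rx->bit_index - 1)); /* LSB first */
            }
        }
        rx->bit_index++;
        return false;
    }
    /* Stop bit: must be 1, and data + parity must hold an odd number of ones. */
    bool ok = data_bit && (rx->ones & 1u) != 0;
    uint8_t byte = rx->data;
    ps2_rx_reset(rx);
    if (ok) {
        *out = byte;
    }
    return ok;
}

/* Helper for experiments: synthesise the 11 clock edges of one valid frame. */
bool ps2_rx_feed_frame(ps2_rx_t *rx, uint8_t byte, uint8_t *out)
{
    unsigned ones = 0;
    (void)ps2_rx_clock_edge(rx, false, out);                 /* start bit */
    for (int i = 0; i < 8; i++) {
        bool bit = ((byte >> i) & 1u) != 0;
        ones += bit ? 1u : 0u;
        (void)ps2_rx_clock_edge(rx, bit, out);
    }
    (void)ps2_rx_clock_edge(rx, (ones & 1u) == 0, out);      /* odd parity bit */
    return ps2_rx_clock_edge(rx, true, out);                 /* stop bit */
}
```

Keeping the per-edge work this small matters because the function would run inside the interrupt handler once per clock edge.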

void InitEUSART(void)
{
    // Initialise I/O ports for EUSART
    TRISCbits.RC6 = 0;
    TRISCbits.RC7 = 0;
    // Open EUSART
    OpenUSART(USART_TX_INT_OFF &
            USART_RX_INT_OFF &
            USART_ASYNCH_MODE &
            USART_EIGHT_BIT &
            USART_BRGH_HIGH, 25);
}

InitEUSART function, as previously mentioned, initialises the UART for debug message print. Its function is not critical at this stage of development; nonetheless, it is still worth mentioning. Note that the UART will only transmit and both TX and RX interrupts are disabled.
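The value 25 passed to OpenUSART is the baud rate generator reload value. Assuming the 48MHz system clock described earlier and the 8-bit asynchronous high-speed formula from the datasheet, baud = Fosc / (16 x (n + 1)), the resulting rate can be worked out as follows. The function name is invented for this sketch, and the 115200 target is my reading of the number rather than something stated in the firmware.

```c
#include <assert.h>
#include <stdint.h>

/* Baud rate of the 8-bit baud rate generator in asynchronous high-speed
   mode (BRGH = 1): Fosc / (16 * (spbrg + 1)), rounded down. */
uint32_t usart_baud_brgh_high(uint32_t fosc_hz, uint8_t spbrg)
{
    assert(fosc_hz > 0);
    return fosc_hz / (16ul * ((uint32_t)spbrg + 1u));
}
```

For usart_baud_brgh_high(48000000, 25) this gives 115384, which is within about 0.2% of the common 115200 rate.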

void InitUSB(void)
{
    USBDeviceInit();
    USBDeviceAttach();
}

InitUSB function initialises the USB device driver/stack. The functions called here (USBDeviceInit, USBDeviceAttach) are from the MLA USB library. USBDeviceInit initialises the USB stack and USBDeviceAttach informs the USB stack that the device (this converter) is ready to be attached to the bus.

CTL122: Hardware Design (Pt. 2)

This article is a continuation of the Hardware Design Part 1. In this part, we will look into the PCB design of the CTL122.

ctl122-3dtop

The left image is a 3D rendering of the CTL122 PCB.

As briefly described in the previous part of this article, there are four 7-segment displays and six indicator LEDs.

The four 7-segment displays will be used to display the operational state of the CTL122 as well as the available options in CFG (configuration) and MEM (memory/macro) modes, and the six indicator LEDs will display power, mode and keyboard states (Num Lock, Caps Lock, Scroll Lock).

There are three interface ports (excluding ICSP): RJ45 connector for the IBM terminal keyboards, mini-DIN for the standard PS/2 keyboards and USB for primary power supply and interfacing with the host computer. Besides interface ports, ICSP is also available to allow easy programming and debugging of the device.

The CTL122 PCB is a two layer design with mostly SMD components. Since the top layer contains display elements and will function as a display panel, efforts were made to minimise the top layer component count. The following are the copper patterns for the two layers:

CTL122: PCB Top Layer  CTL122: PCB Bottom Layer

Some components were placed on the bottom side of the PCB to minimise the top layer component count. This board is by no means intended for mass production, and putting components on both sides of the PCB wouldn't cause too much trouble for us.

The CTL122 PCB can be ordered from OSH Park at the following link: https://oshpark.com/shared_projects/qNCwHvCn. Gerber files are also available for those who wish to manufacture one at home.

The component list for the board is available here. The specific component values can be found on the schematic provided in the Part 1.

CTL122: Hardware Design (Pt. 1)

In this article, I will explain the hardware system design and schematic of the CTL122 programmable PS/2-USB translator.

CTL122 will interface the PS/2 devices in two different connector types: 6-position MiniDIN (standard modern PS/2 connector) and RJ45 (8P5C). 6-pos MiniDIN connector is provided to allow CTL122 to be used with DIN/MiniDIN Model M and standard PS/2 keyboards, and RJ45 connector is provided to allow interfacing with the old IBM terminal keyboards such as the 122-key Model M keyboard that this translator is specifically targeting.

RJ45 connector pinout for the IBM terminal keyboards is as follows:

IBM Terminal Keyboard RJ45 Connector Pinout

Note that only 5 pins of the RJ45 connector are used (pins 3 through 7). The signals transmitted over this connector are the standard PS/2 signals. Refer to the Interfacing with PS/2 Devices article for a detailed explanation of how the PS/2 interface works.

One important thing to note is that the older keyboards such as IBM Model M are very power hungry and can draw as much as 500mA on the Vcc. Since CTL122 and the connected PS/2 keyboard will be directly powered by the standard USB port that can only supply a limited amount of current, this may potentially cause a problem later on. To resolve this problem, there will be a header on the PCB to allow an external power source to be connected to the board Vcc.

In practice, both my laptop and desktop had no problem powering the CTL122 prototype and the 122-key Model M keyboard directly from the USB port.

CTL122 Board Schematic

CTL122 has six indicator LEDs and four 7-segment displays to provide real-time operational information. The three LEDs on the right side provide the standard keyboard state information (Num Lock, Caps Lock, Scroll Lock) and the ones on the left side provide information about the power state of the CTL122 as well as the special operation modes (CFG: Configuration, and MEM: Memory/Macro).

Besides the simple interface and protocol translation functionality, CTL122 provides various extra programmable features at the firmware level. The memory/macro feature allows the operator to press a single key in MEM mode to perform a sequence of key presses. This feature is useful in many games and certain applications where repetitive sequences of key presses are required. The CFG mode of the converter allows it to be configured in a specific manner to provide special key mappings and handling.

Back to the schematic, PIC18F4550 is used as the microcontroller for CTL122. One of the best features of the PIC18F4550 that makes it ideal for this project is that it has an embedded USB controller. Supporting a USB interface with this microcontroller is as simple as routing a differential pair from the USB connector to its USB D+ and D- pins. Note that VUSB is connected to an external 220n capacitor to allow the on-chip 3V3 regulator to operate properly. Embedded USB controller WILL NOT work without this capacitor.

The PS/2 interface (PS2_CLK and PS2_DATA) is connected to general purpose I/O pins of the microcontroller. The on-chip USART could have been utilised for this purpose; however, it was deemed unnecessary as the interface could easily be implemented with general purpose I/O and CCP. Note that PS2_CLK is connected to CCP1 so that the capture module can detect the PS/2 synchronous serial clock. The details of this implementation will be explained in the Firmware article of this series.

An ICSP (in-circuit serial programming) header is provided to allow easy programming and testing of the board. Since the ICSP header includes VCC and GND pins, it is dual-purposed to allow external power connection when the USB port alone cannot provide enough current to power both the converter and the connected keyboard.

An external 20MHz crystal operating in HS mode is provided to allow USB core operation of the PIC18F4550. On-chip internal oscillators do not support USB core operation and an external clock source must be provided. It is important to note that there is a limited number of predefined external clock frequencies that can be used with USB core of this microcontroller.

PIC18F4550 Clock Subsystem Diagram


The diagram above shows the clock subsystem of the PIC18F4550 microcontroller. In order to supply the USB peripheral clock, an external clock source must be provided through OSC1 and OSC2. The external clock is then divided to 4MHz by the PLL prescaler and multiplied to 96MHz by the PLL which, in turn, provides the primary clock (for the CPU core and other peripherals) and the USB clock through a /2 divider at 48MHz. It is critical that either a 48MHz external clock is provided, or an external clock whose frequency can be divided down to exactly 4MHz by the PLL prescaler.

For the sake of keeping this article at a reasonable length, I will not go into all the details of the LEDs and 7-segment displays. The bypass capacitors, however, are worth mentioning. They are absolutely critical in this design as relatively high speed USB is involved and the USB power rail, especially when USB hubs are involved, is a nightmare (loaded with noise).

CTL122: IBM 122-key Model M Keyboard


IBM 122-key Model M Keyboard Layout

The 122-key version of the IBM Model M keyboard uses Scancode Set 3, as compared to the Scancode Set 1 used by the standard PC keyboards. One of the main differences between Scancode Sets 1 and 3 is that Set 3, by default, does not send key-up (break) sequences for most keys. Only special keys (L-/R-Ctrl, L-/R-Alt, L-/R-Shift, CapsLk) send key-up sequences, which consist of 0xF0 followed by the key scancode (e.g. L-Ctrl key-down = { 0x11 }; L-Ctrl key-up = { 0xF0, 0x11 }).

IBM 122-key Model M Keyboard Scancode Map

As made obvious by the illustration above, another difference between Scancode Sets 3 and 1 is that the scancode values for the keys are completely different. For example, the A key is 0x1E in Scancode Set 1, while it is 0x1C in Scancode Set 3.

These differences make it difficult to directly interface the 122-key IBM Model M keyboard, despite having a standard PS/2 interface, to a modern PC. There are possible workarounds at the keyboard driver level to let the operating system handle the Scancode Set 3; however, these workarounds are not universal and limit the portability of the keyboard.

const char Kbd_TransTbl[256] = {
//  0     1     2     3     4     5     6     7
    0x00, 0x00, 0x00, 0x00, 0x00, 0x29, 0x00, 0x3A,   // 0
    0x68, 0x00, 0x00, 0x00, 0x00, 0x2B, 0x35, 0x3B,   // 1
    0x69, 0x00, 0x00, 0x64, 0x00, 0x14, 0x1E, 0x3C,   // 2
    0x6A, 0x00, 0x1D, 0x16, 0x04, 0x1A, 0x1F, 0x3D,   // 3
    0x6B, 0x06, 0x1B, 0x07, 0x08, 0x21, 0x20, 0x3E,   // 4
    0x6C, 0x2C, 0x19, 0x09, 0x17, 0x15, 0x22, 0x3F,   // 5
    0x6D, 0x11, 0x05, 0x0B, 0x0A, 0x1C, 0x23, 0x40,   // 6
    0x6E, 0x00, 0x10, 0x0D, 0x18, 0x24, 0x25, 0x41,   // 7
    0x6F, 0x36, 0x0E, 0x0C, 0x12, 0x27, 0x26, 0x42,   // 8
    0x70, 0x37, 0x38, 0x0F, 0x33, 0x13, 0x2D, 0x43,   // 9
    0x71, 0x00, 0x34, 0x31, 0x2F, 0x2E, 0x44, 0x72,   // 10
    0x00, 0x00, 0x28, 0x30, 0x00, 0x00, 0x45, 0x73,   // 11
    0x51, 0x50, 0x00, 0x52, 0x4C, 0x4D, 0x2A, 0x49,   // 12
    0x00, 0x59, 0x4F, 0x5C, 0x5F, 0x4E, 0x4A, 0x4B,   // 13
    0x62, 0x63, 0x5A, 0x5D, 0x5E, 0x60, 0x53, 0x54,   // 14
    0x00, 0x58, 0x5B, 0x00, 0x57, 0x61, 0x55, 0x00,   // 15
    0x00, 0x00, 0x00, 0x00, 0x56, 0x00, 0x00, 0x00,   // 16
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 17
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 18
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 19
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 20
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 21
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 22
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 23
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 24
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 25
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 26
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 27
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 28
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 29
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,   // 30
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00    // 31
};

Scancode Set 3 to USB Scancode Translation Table

Since CTL122 will be interfacing the PS/2 122-key Model M keyboard to a standard USB HID device, USB HID scancodes are used in the translation table.
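To show how such a table might be used together with the 0xF0 break prefix, here is a minimal sketch of a byte-by-byte decoder. It is not the logic from Keyboard.c: the type and function names are hypothetical, and instead of the full 256-entry table it carries only three entries copied from the listing above (0x15, 0x16 and 0x1C).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PS2_BREAK_PREFIX 0xF0u

/* Excerpt of the translation table: Scancode Set 3 code to USB HID usage. */
static uint8_t set3_to_usage(uint8_t code)
{
    switch (code) {
    case 0x15: return 0x14; /* Q */
    case 0x16: return 0x1E; /* 1 */
    case 0x1C: return 0x04; /* A */
    default:   return 0x00; /* unmapped */
    }
}

typedef struct {
    bool break_pending; /* true after 0xF0 until the next scancode arrives */
} set3_decoder_t;

typedef struct {
    uint8_t usage;   /* USB HID usage ID */
    bool released;   /* true for key-up, false for key-down */
} key_event_t;

/* Feed one byte received from the keyboard.
   Returns true when *ev has been filled with a mapped key event. */
bool set3_decode(set3_decoder_t *dec, uint8_t byte, key_event_t *ev)
{
    assert(dec != NULL && ev != NULL);
    if (byte == PS2_BREAK_PREFIX) {
        dec->break_pending = true;
        return false;
    }
    ev->usage = set3_to_usage(byte);
    ev->released = dec->break_pending;
    dec->break_pending = false;
    return ev->usage != 0x00;
}
```

A table entry of 0x00 is treated as "no mapping" here, which mirrors the many zero entries in the full table.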

CTL122: Introduction

I bought a 122-key IBM Model M mechanical keyboard some time in January and made a fully programmable PS/2 to USB converter for translating its non-standard scan codes. Standard active PS/2 to USB converters cannot be used here because the scan codes sent by this keyboard are completely different from those of the standard PC keyboard; moreover, this keyboard does not send key-up scan codes for keys other than CTRL/SHIFT/ALT.

The following picture shows the 122-key IBM Model M keyboard (somewhat French Canadian layout, P/N 1395662, 15NOV86):

20131114_190502

This keyboard was originally intended for use with the IBM terminals and its key layout is significantly different from that of the standard 102-key PC keyboard: it has additional function keys F13 through F24, a Trait/Pos 1 key in the arrow key section and two extra columns of keys at the left side of the keyboard.

20140715_203105  20140715_203214
The pictures above show my quick-and-dirty perfboard implementation of the PS/2-USB converter. As with most perfboard designs, wires are all over the place and the lack of a proper ground plane caused various signal integrity/interference issues. For example, the converter sometimes malfunctions when new devices are plugged into the USB hub that it is plugged into. After tens of minutes of fiddling around, it was evident that the issue wasn't a lack of bypass caps on the board, but rather the noisy ground bouncing around.

Since I am using this keyboard on a daily basis and plan to do so for many years to come, I decided to design a proper PCB: CTL122.

In the next article, I will explain the 122-key IBM Model M keyboard scan code and overall implementation details.

System/14: System I/O Map

System/14 I/O address space layout is based on that of the IBM PC and other compatible systems. The I/O address space of the System/14, however, contains additional I/O devices at the addresses that are not occupied by the original IBM PC design.

The following table shows the I/O address space layout:
RANGE      SIZE  DESCRIPTION
0000-001F  2^5   8237 Programmable DMA Controller #1
0020-003F  2^5   8259 Programmable Interrupt Controller #1
0040-005F  2^5   8253 Programmable Interval Timer
0060-006F  2^4   8042 Keyboard Controller
0070-007F  2^4   MC146818 Real-time Clock
0090-009F  2^4   8255 Programmable Peripheral Interface (NON-STANDARD)
00A0-00AF  2^4   8259 Programmable Interrupt Controller #2
00C0-00DF  2^5   8237 Programmable DMA Controller #2

The address space layout, at this stage, is incomplete and may be modified in the future. All modifications to the layout will be described in the CHANGELOG section below.

The Phase 1 implementation of the System/14 may not fully implement the address space layout specified above. Detailed information on the Phase 1 system I/O map will be provided in appropriate future Phase 1 development documentations.
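The I/O map can also be written down as a lookup table, which is handy when writing an emulator or checking an address decoder design against it. The sketch below simply transcribes the ranges from the table above; io_range_t and io_lookup are names made up for this example and say nothing about how the System/14 hardware actually decodes addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t first;      /* first port of the range */
    uint16_t last;       /* last port of the range (inclusive) */
    const char *device;  /* description from the I/O map */
} io_range_t;

static const io_range_t io_map[] = {
    { 0x0000, 0x001F, "8237 Programmable DMA Controller #1" },
    { 0x0020, 0x003F, "8259 Programmable Interrupt Controller #1" },
    { 0x0040, 0x005F, "8253 Programmable Interval Timer" },
    { 0x0060, 0x006F, "8042 Keyboard Controller" },
    { 0x0070, 0x007F, "MC146818 Real-time Clock" },
    { 0x0090, 0x009F, "8255 Programmable Peripheral Interface" },
    { 0x00A0, 0x00AF, "8259 Programmable Interrupt Controller #2" },
    { 0x00C0, 0x00DF, "8237 Programmable DMA Controller #2" },
};

static_assert(sizeof(io_map) / sizeof(io_map[0]) == 8, "one entry per table row");

/* Return the map entry that contains the port, or NULL if it is unassigned. */
const io_range_t *io_lookup(uint16_t port)
{
    for (size_t i = 0; i < sizeof(io_map) / sizeof(io_map[0]); i++) {
        if (port >= io_map[i].first && port <= io_map[i].last) {
            return &io_map[i];
        }
    }
    return NULL;
}
```

Ports such as 0x0080 fall into the gaps of the map and yield NULL.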

CHANGELOG
11/07/2014  Initial revision

System/14: 8282 Address Latch and 8286 Bus Transceiver

The 8282 octal address latch and 8286 octal bus transceiver bridge the processor local bus (which may carry a single 8086 alone, or an 8086 together with coprocessors such as the 8087 and 8089) to the system bus.

Intel 8282 Pinout          Intel 8286 Pinout

An address latch is required for system bus operation because the 8086 and coprocessors use a multiplexed address/data bus, and address and data may not be simultaneously available on the processor local bus. The 8282 address latch is directly controlled by the 8288 bus controller (refer to 8288 Bus Controller article) and latches the address available on DI7:DI0 from the processor local bus when STB (strobe) input from the ALE (address latch enable) output of 8288 is activated.

This latched address is output to the DO7:DO0 when \OE goes active. For a single bus master design, the \OE input of the 8282 is directly connected to GND to always enable its DO7:DO0 outputs. If an 8237 DMA is used in the system, this input needs to be connected to the active high AEN output of the DMA controller along with the other necessary supporting logic. This will be further discussed in the future 8237 article.

Application Note: A-140

The 8286 bus transceiver buffers the content of either the processor local bus or the system bus depending on the state of the T (transmit) input. When both T and \OE are active, the lines A7:A0, the processor local bus side, are buffered and output to the B7:B0, the system bus side. When T is inactive and \OE is active, the lines B7:B0 are buffered and output to A7:A0, thereby allowing peripherals to transmit data into the processor local bus.

The T input of the 8286 is connected to the DT/R (data transmit when HIGH and receive when LOW) output of the 8288 bus controller. The \OE input of the 8286 is not directly connected to the DEN output of the 8288, however. Note that the \OE input of the 8286 is active low and the DEN of the 8288 is active high. This calls for inverting logic between DEN and \OE. The purpose of having the active low \OE in the 8286 is to easily allow the processor local bus peripherals to disable the bus transceiver on demand. This is shown in the diagram above: note that the 8259A interrupt controller SP/EN output is NANDed with DEN from the 8288 bus controller.

This allows the 8259A to disable the bus transceiver, thereby isolating the processor local bus and the system bus during its I/O and interrupt acknowledge cycles (the interrupt acknowledge cycle utilises the processor local bus to transfer the interrupt vector to the processor). By isolating the processor local bus from the system bus, the two buses can function independently of each other (e.g. interrupt acknowledge on the processor local bus while a DMA transfer is in progress on the system bus).
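The transceiver control described above boils down to a small piece of combinational logic, modelled below as a sketch. The signal names follow the text (DEN, SP/EN, DT/R); the enum, the function names and the self-check routine are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    XCVR_ISOLATED,         /* \OE inactive: local bus and system bus separated */
    XCVR_LOCAL_TO_SYSTEM,  /* T high: A7:A0 driven onto B7:B0 */
    XCVR_SYSTEM_TO_LOCAL   /* T low:  B7:B0 driven onto A7:A0 */
} xcvr_state_t;

/* Level on the active-low \OE pin: NAND of DEN (8288) and SP/EN (8259A). */
bool xcvr_oe_n(bool den, bool sp_en)
{
    return !(den && sp_en);
}

/* State of the 8286 for a given combination of control signals. */
xcvr_state_t xcvr_state(bool den, bool sp_en, bool dt_r)
{
    if (xcvr_oe_n(den, sp_en)) {
        return XCVR_ISOLATED;          /* \OE is high, outputs disabled */
    }
    return dt_r ? XCVR_LOCAL_TO_SYSTEM : XCVR_SYSTEM_TO_LOCAL;
}

/* Walk through the cases discussed in the text. */
void xcvr_self_check(void)
{
    assert(xcvr_state(true, true, true) == XCVR_LOCAL_TO_SYSTEM);   /* write cycle */
    assert(xcvr_state(true, true, false) == XCVR_SYSTEM_TO_LOCAL);  /* read cycle */
    assert(xcvr_state(true, false, false) == XCVR_ISOLATED);        /* 8259A takes over */
    assert(xcvr_state(false, true, true) == XCVR_ISOLATED);         /* DEN inactive */
}
```

A single NAND gate is enough because either source pulling its input low forces \OE high.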

 

8282/8286 Logic Analyser Capture

The logic analyser capture shows the boot sequence of the 8086 system. The analyser was configured to trigger on ADDR[19:0] at 0xFFFF0 and it triggered as soon as I let \RES go high after the initial power up. Note that 0xFFFF0 is the processor bootstrap address for the 8086 and the processor attempts a code access (S[2:0] = 0b100) as soon as the RESET signal is deactivated. The rest is gibberish since the data bus was still left unconnected (the 8286 bus transceivers were connected to the processor local bus; however, the system bus side of the 8286s was left unconnected).

Here is a photo of the breadboarded 8284/8086/8288/3x8282/2x8286. As you may observe from the photo below, I am not a really big fan of cutting wires to length when breadboarding.

8282/8286 Breadboard

First column: 8284 clock generator, 8086 processor, 8288 bus controller.
Second column: 8282 address latch A0:A7, 8282 A8:A15, 8282 A16:A19, 8286 D0:D7, 8286 D8:D15.
Third column: 74LS00 quad NAND

 

System/14: 8288 Bus Controller

Intel 8288 is a bus controller that decodes the maximum mode 8086 S[2:0] encoded bus control signal outputs to individual bus control signals.

Intel 8288 Pinout

S[2:0] control signals are connected from the 8086 processor to the S0, S1 and S2 inputs of 8288 for decoding. The S signals are decoded as follows:

S2 S1 S0  DESC           COMMAND
0  0  0   INTERRUPT ACK  \INTA
0  0  1   READ I/O PORT  \IORC
0  1  0   WRITE I/O PORT \IOWC, \AIOWC
0  1  1   HALT           None
1  0  0   CODE ACCESS    \MRDC
1  0  1   READ MEMORY    \MRDC
1  1  0   WRITE MEMORY   \MWTC, \AMWC
1  1  1   PASSIVE        None
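The decode table above maps directly onto a small lookup in C. This is only a sketch for reference; the enum and function names are my own, and the command strings simply repeat the table.

```c
/* Bus cycle types in the same order as the S2 S1 S0 encoding (0..7). */
typedef enum {
    BUS_INTERRUPT_ACK = 0,
    BUS_READ_IO,
    BUS_WRITE_IO,
    BUS_HALT,
    BUS_CODE_ACCESS,
    BUS_READ_MEMORY,
    BUS_WRITE_MEMORY,
    BUS_PASSIVE
} bus_cycle_t;

/* Combine the three status lines into the bus cycle they encode. */
bus_cycle_t decode_status(unsigned s2, unsigned s1, unsigned s0)
{
    return (bus_cycle_t)(((s2 & 1u) << 2) | ((s1 & 1u) << 1) | (s0 & 1u));
}

/* Command output(s) the 8288 asserts for a given bus cycle. */
const char *command_name(bus_cycle_t cycle)
{
    static const char *const names[8] = {
        "\\INTA", "\\IORC", "\\IOWC, \\AIOWC", "None",
        "\\MRDC", "\\MRDC", "\\MWTC, \\AMWC", "None"
    };
    return names[cycle];
}
```

For example, decode_status(1, 0, 0) yields BUS_CODE_ACCESS, which maps to \MRDC, the same command used for an ordinary memory read.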

The individual decoded bus control signals are used to control 8282 latch, 8286 transceiver and memory devices on the bus.

The following picture shows, from the top, an 8284 clock generator, 8086 processor and 8288 bus controller breadboarded for testing:

8288 Breadboard

The CLK, RESET and READY outputs of the 8284 were connected to the 8086 to supply the main clock and reset signals. The 8288 bus controller was connected to the S[2:0] outputs of the 8086 for testing. At this stage, the multiplexed address/data bus of the 8086 was left unconnected as the purpose of this test was to verify that the 8288 was functional. Despite the bus being unconnected, the 8086 will, purely by chance, read at least one instruction from the floating bus that is decoded as a memory reference.

8288 Logic Analyser Capture

And of course, the 8086 managed to decode a memory reference instruction from the gibberish read from the unconnected address/data bus.

At the beginning of the capture, the S[2:0] lines go from 111 (passive) to 100 (code access). This is the fetch cycle of the 8086 processor used to fetch instructions from the code memory to its internal instruction queue. This transition activates the ALE (address latch enable) signal to force the 8282 (which I do not possess at the moment; I am planning to visit a local junkyard soon to search for one) to latch the target address present on the multiplexed address/data bus. Following the address latch signal is the DT/R (data transmit/receive). DT/R indicates data transmit (processor-to-bus) when HIGH and data receive (bus-to-processor) when LOW. In this case, DT/R is set to LOW to configure the 8286 bus transceiver in receive mode. This is shortly followed by the MRDC signal being pulled LOW to command the selected memory device to read (MRDC is active LOW). After the MRDC signal is activated, the DEN signal is activated to enable the 8286 transceiver, thereby allowing the data present on the system data bus to transfer onto the processor local bus.

All subsequent bus cycles work in a similar manner. Note that there are also three write cycles seen above in the capture. In case of a write cycle, DT/R is set HIGH and DEN is activated prior to MWTC to present the data to be written to the system bus before a write command is issued.

Either way, it seems like 8288 is still alive after all these years and ready to be used in this project.

System/14: System Memory Map

System/14 address space layout is based on that of the IBM PC and other compatible systems in order to support standard DOS operating systems and applications.

The following table shows the address space layout:
RANGE        SIZE   TYP  DESCRIPTION
00000-003FF   1024  RAM  Interrupt Vector Table (IVT)
00400-004FF    256  RAM  BIOS Data Area (BDA)
00500-9FFFF 654080  RAM  Free
A0000-BFFFF 131072  RAM  Video RAM (separate from main memory)
C0000-C7FFF  32768  ROM  Video BIOS
C8000-EFFFF 163840  ROM  Misc. Hardware BIOS Area (e.g. Disk Adapter)
F0000-FFFFF  65536  ROM  System BIOS
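For reference, the table above can be expressed as a lookup in C. This is a minimal sketch; the struct layout and function names are my own and not part of any System/14 firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the address space layout table. */
typedef struct {
    uint32_t start;     /* first physical address (inclusive) */
    uint32_t end;       /* last physical address (inclusive)  */
    const char *type;   /* "RAM" or "ROM"                     */
    const char *desc;
} region_t;

static const region_t memory_map[] = {
    { 0x00000, 0x003FF, "RAM", "Interrupt Vector Table (IVT)" },
    { 0x00400, 0x004FF, "RAM", "BIOS Data Area (BDA)" },
    { 0x00500, 0x9FFFF, "RAM", "Free" },
    { 0xA0000, 0xBFFFF, "RAM", "Video RAM" },
    { 0xC0000, 0xC7FFF, "ROM", "Video BIOS" },
    { 0xC8000, 0xEFFFF, "ROM", "Misc. Hardware BIOS Area" },
    { 0xF0000, 0xFFFFF, "ROM", "System BIOS" },
};

/* Return the region containing a 20-bit physical address, or NULL. */
const region_t *find_region(uint32_t addr)
{
    size_t count = sizeof memory_map / sizeof memory_map[0];
    for (size_t i = 0; i < count; i++) {
        if (addr >= memory_map[i].start && addr <= memory_map[i].end)
            return &memory_map[i];
    }
    return NULL;
}

/* Size of a region in bytes, matching the SIZE column of the table. */
uint32_t region_size(const region_t *region)
{
    return region->end - region->start + 1u;
}
```

Looking up the 8086 bootstrap address 0xFFFF0 lands in the System BIOS region, and region_size() on the Free region returns the 654080 bytes listed in the table.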

Interrupt Vector Table (IVT) is used by the 8086 CPU to dispatch appropriate interrupt service routines. The 8086 CPU can service up to 256 different interrupts (00-FF) and the address of each interrupt service routine is stored in the IVT. Each entry is four bytes long and holds the SEGMENT:OFFSET address of an ISR, where the first two bytes of the entry hold the offset and the following two bytes hold the segment (both little-endian).
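To make the entry layout concrete, here is a short C sketch that resolves an interrupt number to the physical address of its ISR. On the 8086 the offset word sits at the lower address of each 4-byte entry and the segment word follows it. The function names and the flat mem array standing in for physical memory are assumptions of this example.

```c
#include <stdint.h>

/* Each IVT entry is 4 bytes, so vector N lives at physical address 4 * N. */
uint32_t ivt_entry_address(uint8_t vector)
{
    return (uint32_t)vector * 4u;
}

/* Read the entry for `vector` from `mem` (a copy of physical memory starting
 * at address 0) and return the 20-bit physical address of the ISR. */
uint32_t isr_physical_address(const uint8_t *mem, uint8_t vector)
{
    uint32_t entry = ivt_entry_address(vector);
    uint16_t offset  = (uint16_t)(mem[entry]     | (mem[entry + 1] << 8));
    uint16_t segment = (uint16_t)(mem[entry + 2] | (mem[entry + 3] << 8));
    /* Physical address = segment * 16 + offset, wrapped to 20 bits. */
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For instance, an entry for vector 08 holding the bytes A5 FE 00 F0 decodes to F000:FEA5, which is physical address FFEA5.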

BIOS Data Area (BDA) is a reserved RAM area for System BIOS (also known as ROM BIOS) operation. System BIOS utilises this area for storing various system operational data and this data may be used by the OS to obtain system information.

00500-9FFFF is the freely available RAM area that can be utilised by the OS and applications.

Video RAM Area is a RAM area separate from the main RAM area that is located within the video module. This area is used as graphics and text mode video buffers.

Video BIOS Area is a ROM area located within the video module. This ROM area contains the program code for video BIOS operation. The System BIOS scans for this area during the POST and invokes the Video BIOS initialisation procedure if present.

Misc. Hardware BIOS Area is a reserved area that may be decoded by various peripherals present on the system/peripheral bus. According to the PC-XT Hardware Reference Library, the absolute addresses hex C8000 through F4000 are scanned in 2K blocks in search of a valid adapter card ROM.
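The scan range quoted above can be enumerated with a few lines of C. This is only a sketch with names of my own choosing. The 55 AA signature check is the usual IBM PC convention for adapter ROMs and is an addition of this example; the paragraph above does not spell it out.

```c
#include <stdbool.h>
#include <stdint.h>

#define ROM_SCAN_FIRST 0xC8000u  /* first block checked            */
#define ROM_SCAN_LAST  0xF4000u  /* last block checked (inclusive) */
#define ROM_SCAN_STEP  0x800u    /* 2K block size                  */

/* Number of 2K blocks visited when scanning C8000 through F4000. */
unsigned rom_scan_block_count(void)
{
    return (ROM_SCAN_LAST - ROM_SCAN_FIRST) / ROM_SCAN_STEP + 1u;
}

/* Physical address of the block with the given index (0-based). */
uint32_t rom_scan_block_address(unsigned index)
{
    return ROM_SCAN_FIRST + index * ROM_SCAN_STEP;
}

/* Assumed validity test: an adapter ROM starts with the bytes 55 AA. */
bool rom_signature_present(const uint8_t *block)
{
    return block[0] == 0x55 && block[1] == 0xAA;
}
```

Scanning inclusively from C8000 to F4000 in 2K steps visits 89 candidate blocks.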

System BIOS Area is a ROM area that is occupied by the System BIOS and other firmware. The actual program code of the System BIOS begins at F000:E000 and the area before this may be utilised by the ROM BASIC (if present) or other hardware-specific system firmware.

 

Errata 1 (06/06/2014)
00500-C7FFF 817920  RAM  Free is an invalid range specification as video devices were assigned at A0000-C7FFF. Revised to 00500-9FFFF 654080  RAM  Free.

System/14: 8284 Clock Generator

Intel 8284 is a clock generator/driver intended for use with MCS-86 family processors. 8284 serves as the primary clock source to the 8086 processor and 8087/8089 co-processors.

8284 provides three different clock outputs: OSC, CLK, PCLK. OSC is the buffered crystal frequency output, CLK is 1/3 of the active clock source frequency with 1/3 duty cycle intended for system processors and processor local bus devices, and PCLK is 1/2 of the CLK frequency intended for peripheral devices.

Intel 8284A Pinout

 

8284 can generate its outputs from either a crystal connected to X1/X2 with appropriate loading capacitors or an externally driven clock source fed into the EFI pin. The externally driven clock (EFI) is selected by strapping the F/C pin HIGH and the crystal is selected by strapping the F/C pin LOW.

Besides the clock outputs, 8284 also provides RESET and READY outputs. The RESET output of 8284 is connected to all devices on the processor local bus (8086, co-processors, and other supporting devices if present) to provide a system-wide reset. RESET is derived from the RES pin connected to an RC timing circuit. The READY synchronisation output of 8284 provides a CPU-clock (CLK) synchronised indication to 8086 when an asynchronous device (ASYNC=LOW) or a slow memory or I/O device (ASYNC=HIGH) requires a bus cycle extension. READY is derived from RDY1 and RDY2 inputs, which are enabled by strapping AEN1 and AEN2 low, respectively.

 

8284 Breadboard

The picture above shows the Intel 8284A breadboarded for testing after not being used for more than 20 years. The clock generator was configured to use the crystal source by pulling F/C low and a 12MHz crystal was connected to the X1 and X2 pins. The OSC, CLK and PCLK pins were connected to the logic analyser for functional verification.


8284 Logic Analyser Capture

The logic analyser waveform shows that the 8284A is still alive and functioning normally. OSC is the buffered direct output of the crystal frequency, CLK is 1/3 the frequency of OSC (crystal) with 33% duty cycle, and PCLK is 1/2 the frequency of CLK with 50% duty cycle as shown above.
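The divider ratios are easy to check with a few lines of C. This is a trivial sketch with hypothetical names that only captures the frequency relationships stated above, not the duty cycles.

```c
#include <stdint.h>

/* Output frequencies of the 8284 when running from a crystal, in Hz. */
typedef struct {
    uint32_t osc_hz;   /* buffered crystal frequency */
    uint32_t clk_hz;   /* processor clock: OSC / 3   */
    uint32_t pclk_hz;  /* peripheral clock: CLK / 2  */
} clocks_8284_t;

clocks_8284_t clocks_from_crystal(uint32_t crystal_hz)
{
    clocks_8284_t clocks;
    clocks.osc_hz  = crystal_hz;
    clocks.clk_hz  = crystal_hz / 3u;
    clocks.pclk_hz = clocks.clk_hz / 2u;
    return clocks;
}
```

With the 12MHz crystal used in this test, the expected readings on the logic analyser are OSC = 12MHz, CLK = 4MHz and PCLK = 2MHz.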

 

System/14: IC Collection

This is a brief update on the IC collection of my System/14 project. I have collected these rather ancient ICs in CERDIP package over the past few years for use in an 8086 computer project and now I have enough of them to build a functional 8086 computer system.

System/14: IC Collection

The following is a list of important ICs in the collection:

  • 8086 Processor
    • SIEMENS SAB 8086-1-C: Intel 79, Intel 8086-1 produced by Siemens; CDIP-40
    • INTEL D8086-2, 1: Intel 79; I3490012; CDIP-40
    • INTEL D8086-2, 2: Intel 84; L7050373; CDIP-40
  • 8087 Math Coprocessor
    • INTEL C8087-2: Intel 84; L5180067; CDIP-40
    • INTEL D8087-2: Intel 84; L5271128; CDIP-40
  • 8089 I/O Coprocessor
    • INTEL D8089A-3, 1: Intel 03; L8103097; CDIP-40
    • INTEL D8089A-3, 2: Intel 03; L8103097; CDIP-40
  • 8284 Clock Generator
    • INTEL D8284A: L8470346; CDIP-18
  • 8288 Bus Controller
    • INTEL D8288, 1: Intel 79; L5490446; CDIP-20
    • INTEL D8288, 2: Intel 79; L4196874; CDIP-20
  • 8287 Octal Transceiver
    • INTEL D8287, 1: Intel 79; L4036079; CDIP-20
    • INTEL D8287, 2: Intel 79; L1446091; CDIP-20
  • 8259 Programmable Interrupt Controller
    • INTEL D8259A, 1: Intel 80; L3387004; CDIP-28
    • INTEL D8259A, 2: Intel 76; 0506F; CDIP-28
  • 8237 Programmable DMA Controller
    • INTEL D8237A-3, 1: AMD 83; L6510270; CDIP-40
    • INTEL D8237A-3, 2: AMD 83; L6510270; CDIP-40
  • 8253 Programmable Interval Timer
    • INTEL D8253-5, 1: Intel 80; L2365009; CDIP-24
    • INTEL D8253-5, 2: Intel 80; U3371853; CDIP-24
  • 8255 Programmable Peripheral Interface
    • INTEL C8255: Intel 76, 14; 3204A; CDIP-40
  • 82586 Ethernet LAN Coprocessor
    • INTEL C82586: Intel 82; S52069; L5400002; CDIP-48
  • 82720 Graphics Display Controller
    • INTEL C82720-31: Intel 83, 49; S52043; V3520020; CDIP-40

System/14: Preface

System/14 is my upcoming weekend MCS-86 computer project. In this project, I will be developing an IBM PC binary-compatible 8086 computer system with the original integrated circuit components from the late 70s and early-to-late 80s.

The final goal of the project is to produce a multi-board computer system that can run standard DOS operating systems and applications without any modifications.

In order to ensure the compatibility with the IBM PC operating systems and applications, all critical system memory and I/O locations will be replicated into its memory and I/O maps. All standard system devices (e.g. 8259 PIC, 8253 PIT) will be present on the system board either as a discrete IC or an FPGA-based chipset to replicate the standard IBM PC system device behaviours.

The project will be completed in three separate phases:
Phase 1: Minimal single-board 8086 computer with no display, minimal IBM PC compatibility
Phase 2: Single-board 8086 computer with 8087 and 8089 co-processors, monochrome video output, full IBM PC compatibility
Phase 3: Multi-board card cage construction with the processor board and various peripheral boards, full IBM PC compatibility

NOTE: The project roadmap provided above is no longer up-to-date. Please refer to the project page for the latest version of roadmap.

Interfacing with PS/2 Devices

The PS/2 interface was first introduced in 1987 along with the release of IBM's Personal System/2 series (you can see where the name comes from) in order to provide a standard interface for connecting keyboards and mice to computer systems. It has served as the de-facto PC standard for the last few decades. Although it is now considered obsolete and its use is deprecated, it is still a very important standard when it comes to commercial and industrial computing. For example, modern KVM switches and cables still utilise PS/2 for the keyboard and mouse interface and almost all industrial computers still support the PS/2 interface. It is also important to note that the PS/2 interface is preferred over USB to this day by certain PC users due to the simple fact that the standard USB HID boot-protocol keyboard report cannot carry more than 6 simultaneous key presses (not counting modifier keys).

1. Connector
A 6-pin Mini-DIN connector is used for the PS/2 bus connection. The following is the illustration of the female PS/2 connector from Wikipedia:


1  DATA          4  VCC
2  NC            5  CLK
3  GND           6  NC

 

2. Bus Interface
As observed from the pin assignment table next to the illustration of the PS/2 connector above, the interface is essentially a simple synchronous serial port consisting of one data line and one clock line. For electrical specifications, VCC must be at +5V +/-10% with respect to GND, and capable of supplying up to 250mA. VL is 0.0 to 0.7V, and VH is +2.4V to +5.5V according to the original PS/2 Model 25 Technical Reference documentation (No. 84X0672). CLK should be operated within the 10 to 30kHz range. The bus is bi-directional and both DATA and CLK are open-collector lines pulled up to VCC.

The following is a simplified diagram of the bus interface:

Note that nDATA and nCLK outputs are inverted (active low) and it should be taken into account when implementing the control logic. Also since the operating logic levels are TTL, it may be necessary to place a level shifter before interfacing with any modern LVTTL/LVCMOS microcontrollers that do not support TTL logic levels.

 

3. Data Frame
A data frame is the smallest unit of transmission on the bus. Any data to be transmitted on the bus must be divided into and sent in data frames. Each data frame consists of 8 data bits, one parity bit and two control bits (START and STOP), totalling 11 bits (in the host-to-device communication scenario, an ACK bit brings this to 12 bits).

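The following table illustrates the structure of a data frame:

BIT   NAME    VALUE
1     START   0
2-9   DATA    8 data bits, LSB first
10    PARITY  Odd parity over the data bits
11    STOP    1
12    ACK     0, driven by the device (host-to-device frames only)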

Note that, since the bi-directional bus interface utilises the open-collector format, the released (high-impedance) bus always rests at HIGH level because of the pull-ups. From that we can intuitively conclude that the START and STOP bits will be 0 and 1, respectively. As stated in the table, the data bits are transmitted starting with the LSB (least significant bit). Parity is an odd parity bit that depends on the number of 1s in the data bits. If the number of 1s in the data bits is even, the parity bit is 1, and vice versa (# of 1s + parity bit = always an odd #). The 12th bit in the sequence, ACK, is only present in host-to-device data frames, in order to allow the device to acknowledge that the data frame has been successfully received.
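The frame layout and parity rule above can be sketched in C as follows. The function names and the choice of packing the frame into a 16-bit word (bit 0 sent first) are my own conventions for this example.

```c
#include <stdint.h>

/* Odd parity: returns 1 when the data byte has an even number of 1s,
 * so that data bits plus parity always contain an odd number of 1s. */
uint8_t ps2_odd_parity(uint8_t data)
{
    unsigned ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (data >> i) & 1u;
    return (ones % 2u == 0u) ? 1u : 0u;
}

/* Build the 11-bit frame. Bit 0 of the result is transmitted first:
 *   bit 0      START  (always 0)
 *   bits 1..8  DATA   (LSB first)
 *   bit 9      PARITY (odd)
 *   bit 10     STOP   (always 1) */
uint16_t ps2_build_frame(uint8_t data)
{
    uint16_t frame = 0;
    frame |= (uint16_t)((uint16_t)data << 1);
    frame |= (uint16_t)((uint16_t)ps2_odd_parity(data) << 9);
    frame |= (uint16_t)(1u << 10);
    return frame;
}
```

Shifting the returned word out LSB first, one bit per clock cycle, gives the sequence START, D0 to D7, PARITY, STOP.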

 

4. Bus Control Procedure
PS/2 is a bi-directional bus and always consists of two bus elements: host and device (no more, no less). Host usually refers to an element that receives the input, or simply a computer. Device refers to an element that provides the input, or simply keyboard and mice.

Host reserves the right to disrupt the communication by a device at any time and the device must respond accordingly. For example, while a device is transmitting a data frame to a host, the host may decide to interrupt the device data frame and send its data frame to the device. In this case, the device must abort the current transmission and buffer the data frame that it was transmitting at the time and re-attempt the transmission after the host-to-device communication is complete.

A device-to-host communication begins when the device starts generating CLK and sending a data frame. In this communication process, the device always sets the DATA line at the rising edge (or high state, in case of START bit) of the CLK, and host reads the DATA line at the falling edge of the CLK. The CLK is generated until all bits in the data frame are transmitted (including STOP bit). The following is the timing diagram for device-to-host data transmission:

Note that this is an ideal representation of the signal timing. There may be a certain delay period (set-up delay) until the DATA is valid after the rising edge of the CLK in physical implementations. As long as this delay is less than 1/2 period of one clock cycle, this does not pose a problem. The DATA need only be valid at the falling edge of the CLK cycle.
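On the host side, receiving a frame boils down to sampling DATA at each of the 11 falling edges of CLK and then validating the result. The following C sketch shows only the validation step; it assumes the samples have already been collected into an array (how you capture them depends on your microcontroller), and the function name is hypothetical.

```c
#include <stdint.h>

/* Validate and decode one device-to-host frame.
 * bits[0..10] are the DATA levels (0 or 1) sampled at successive falling
 * edges of CLK: START, D0..D7, PARITY, STOP.
 * Returns the data byte (0..255) on success or -1 on a framing/parity error. */
int ps2_decode_frame(const uint8_t bits[11])
{
    unsigned ones = 0;
    uint8_t data = 0;

    if (bits[0] != 0)               /* START bit must be 0 */
        return -1;

    for (int i = 0; i < 8; i++) {   /* data bits arrive LSB first */
        if (bits[1 + i]) {
            data |= (uint8_t)(1u << i);
            ones++;
        }
    }

    if ((ones + (bits[9] ? 1u : 0u)) % 2u != 1u)   /* odd parity check */
        return -1;

    if (bits[10] != 1)              /* STOP bit must be 1 */
        return -1;

    return data;
}
```

A frame that fails any of the three checks should be discarded; what the host does next (for example, asking the device to resend) is outside the scope of this sketch.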

A host-to-device communication is initiated when the host drives CLK to LOW for at least 100μs. After this, the host drives DATA to LOW, releases the CLK line, and waits for the device to generate CLK (it may take a few milliseconds for the device to respond). As soon as the device starts generating the CLK, the host begins transmitting accordingly. It is very important to note that in this process, the host sets the DATA line at the falling edge of the CLK, and the device reads the DATA line at the rising edge of the CLK, whereas in the device-to-host communication procedure, the opposite holds true. In short, the DATA line must be valid at the rising edge of the CLK. The host, after sending the PARITY bit, releases the DATA line (thereby setting the STOP bit), waits for a falling edge on the CLK, and verifies the ACK bit (0: since the released DATA line is pulled HIGH, only an actively driven LOW can signal an acknowledgement). Note that the device will verify the STOP bit (line release) and, at the same time, drive DATA to LOW to form the ACK bit (the edge triggering reverses here: the device sets DATA at the rising edge of the CLK for the ACK bit, whereas the host sets the rest of the bits at the falling edge). The following is the timing diagram for host-to-device data transmission:

Once again, note that this is an ideal representation of the timing. The rule of thumb is to have the DATA line valid at every rising edge of the clock cycle. In case of ACK, as previously mentioned, its value is set by the device (not by the host) and expected to be valid at the falling edge of the specific clock cycle.

 

p.s. I decided to write this article because I've recently acquired a 1987 122-key IBM Model M keyboard with PS/2 port (the legendary buckling spring mechanical keyboard, but with 122 keys) and had to design an active PS/2-to-USB interface/protocol/scan code translator.

Electric Bicycle Control Power Supply

Following the update on the I/O subsystem of my electric bicycle project, this is (not much of) the design for the 5V Control Power Supply Unit (CPSU).

The main component of the CPSU is the LM2678T-5.0 switching regulator from TI with the peak efficiency up to 87% (you will NEVER get this with any linear regulators, especially doing 24/12V to 5V step-down conversion).

The following is a quick look at the awesomeness of the LM2678:

LM2678: Efficiency vs Input Voltage
LM2678: Efficiency vs Load Current

You can observe from the plots above that the conversion efficiency for 5V operation stays at around 85% or above given that you operate within the reasonable amount of current limit (less than 5A) and the input voltage is not too high (well, since the regulator is going to be powered from either the half or full tap on the 24V battery bank, the efficiency will always be at least 85% if not higher at most of the times).
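To put the comparison with linear regulators into numbers, here is a small C sketch. It uses the ideal-case assumption that a linear regulator passes the full load current from input to output, so its efficiency cannot exceed Vout/Vin. The function names are my own.

```c
/* Best-case efficiency of a linear regulator: all of (Vin - Vout) is
 * dropped across the pass element at the full load current. */
double linear_efficiency(double vin, double vout)
{
    return vout / vin;
}

/* Power lost in the converter, in watts, for a given output voltage,
 * load current and efficiency (0 < efficiency <= 1). */
double conversion_loss_w(double vout, double iout, double efficiency)
{
    double p_out = vout * iout;
    return p_out / efficiency - p_out;
}
```

At 24V in and 5V out, a linear regulator tops out at about 21% efficiency and burns 19W for every ampere of load, whereas a switching regulator running at 87% loses roughly 0.75W at the same 1A load.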

The following is the schematic of the CPSU which is pretty much straight off from the LM2678 application note with some minor adjustments (most importantly, lack of the input side electrolytic capacitors as the supercaps will do it instead):

CPSU: Schematics

What you see above is essentially a conventional switching regulator circuit with an input-side high frequency noise suppressing cap, Schottky diode, inductor output and feedback, and some (relatively) large output-side filtering caps.

Following the schematics is the PCB layout:

CPSU: PCB Layout

 

Electric Bicycle Module I/O Subsystem

Once again, slowly uploading some of the important bits from the early development stage of my electric bicycle project, this one is about the I/O subsystem for all subsystem modules.

If you recall from the previous article (Electric Bicycle System Architecture), all control subsystems (including, but not limited to, Battery Subsystem, Capacitor Subsystem and Motor Subsystem) have a dedicated bus called DAC (data acquisition and control) serial. This is the ultimate data bus intended for simple communication between the Main Control System and all subsystems.

The bus consists of four lines: SEL[0], SEL[1], DCLK, DATA. As the names suggest, SEL[1:0] are used for selecting the target parameter number (either data retrieval or control command), and DCLK and DATA are used for synchronous serial data bit retrieval. You may think by now that the bus is not truly serial and rather seems like an inefficient abridged version of I2C with select lines. While this may be true, at the time of development I was considering using only the classic through-hole packages (I really did not feel like working with SMDs and did not have a proper hot air rework station for them either) and trying to utilise as many 7400 series ICs as I could (yes, I'm quite a fan of them to be honest, even to the degree that I once almost planned to build a microprocessor using them).

The following is the I/O subsystem schematics from the Battery Monitoring Unit (BMU, equivalent to Battery Subsystem in version 1 design):

BMU: I/O Subsystem

Here, you see the ADS7822P ADC for metering the BMU voltage and current flow. The signal sources not shown here (VPROBE, IPROBE+/-) are, as you can imagine, quite straightforward: a voltage divider for VPROBE and the voltage across the current shunt resistor for IPROBE.

The role of the 74139 and 74153 is essentially chip selecting and data multiplexing. The 74139 decodes SEL[1:0] and uses the decoded output to select the target ADC. The 74153 then multiplexes the selected chip output onto the bus DATA line.
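The select-and-multiplex path can be modelled behaviourally in C. This is only a sketch: it treats one half of the 74139 as a 2-to-4 decoder with active-low outputs and one half of the 74153 as a 4-to-1 multiplexer, ignores the enable/strobe pins, and uses function names of my own. Which ADC sits on which select value is not specified here and is left to the caller.

```c
#include <stdint.h>

/* 74139-style 2-to-4 decoder with active-low outputs.
 * Returns a 4-bit value where only the selected output bit is 0. */
uint8_t decode_2to4_n(uint8_t sel)
{
    return (uint8_t)(~(1u << (sel & 3u)) & 0x0Fu);
}

/* 74153-style 4-to-1 multiplexer: routes the selected input bit
 * (for example one ADC's serial output) onto the shared DATA line. */
uint8_t mux_4to1(uint8_t sel, const uint8_t inputs[4])
{
    return (uint8_t)(inputs[sel & 3u] & 1u);
}
```

For SEL[1:0] = 2, the decoder output is 1011 (only the third chip select is LOW) and the multiplexer forwards the third input to DATA, so the chip that is selected is also the chip that is heard.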

Electric Bicycle System Architecture

It has been a while since I wrote my previous article on the electric bicycle project. To be honest, I've had almost no time to work on it as my other work projects kept me busy (and somewhat excited as well). Since I have yet another exciting project for work coming up and will not be able to work on this project for a long time (or may even never finish it), I will upload what I have finished at the moment.

By the way, the electric bicycle itself is currently in working condition with the components that I designed a while ago (sorry, I had no time to document or post about the process as I was quite in hurry to get the bicycle into working condition).

Simplified System Diagram

 

If you take a quick look at the design, you can observe that the system is highly modularised and each component is assigned a specific range of tasks to look after. The following is the list of the component modules in the system and a brief description of what each module does:

  • Battery Subsystem: provides the first-level circuit protection from the battery, monitors the battery level and current flow, performs battery circuit switching
  • Capacitor Subsystem: controls the capacitor energy flow direction, monitors the capacitor charge level and current flow
  • Motor Subsystem: controls the motor energy flow direction, provides motor power control functionality, monitors motor voltage and current flow
  • Redirection Subsystem: controls high energy subsystem and power bus connectivity (to bus LO/HI)
  • DC-DC Converter Subsystem: performs voltage conversion task (either step-up or step-down) between power buses LO/HI.
  • Control Power Supply: steps down the system power bus voltage (normally the battery voltage) to the control voltage and provides a regulated output to the control power bus.
  • Main Control System: monitors and controls all subsystems

With the help of super-efficient (up to 98%) switching power conversion technologies, this system design allows the power components to operate at almost any voltage (of course, not of a ridiculous value). This is a major advantage in an electric bicycle because the battery voltage (especially at a small scale like this) tends to vary heavily with the amount of load placed on it. This issue may affect the motor power output if the system is linearly driven. Moreover, in order to utilise the regenerative braking to its maximum potential, the ability to adapt to any operating voltage is essential, as the regenerative braking capacitor subsystem must reach at least the battery voltage and a voltage higher than the output voltage of the motor in order to achieve power transfer at its intended value.

The following are some exemplary scenarios:

  • System Initial Start-up: the outdoor temperature is -25 centigrade and the lead acid battery has a limited ability to deliver enough current to provide a reasonable amount of acceleration to the bicycle
      1. (Redirection Subsystem) connect Battery Subsystem to LO BUS/Capacitor Subsystem to HI BUS
      2. (DC-DC Converter Subsystem) operate in step-up mode LO BUS to HI BUS
      3. (Capacitor Subsystem) engage energy inflow mode
      4. (Redirection Subsystem) connect Motor Subsystem to HI BUS once a voltage value required to provide a reasonable amount of acceleration to the bicycle has been reached
  • Battery Voltage Low: the battery voltage has dropped to 21.8 V after long hours of operation; however, the battery still has some juice left (yes, I picked up this phrase from Photonicinduction), the motor can only operate at 60% of its power rating at this voltage.
      1. (Redirection Subsystem) connect Battery Subsystem to LO BUS/Motor Subsystem to HI BUS
      2. (DC-DC Converter Subsystem) operate in step-up mode LO BUS to HI BUS
  • Regenerative Braking: regenerative braking system is engaged. At this instant, the motor voltage is 34 V, the capacitor bank voltage is 23 V, the battery voltage is 22.7 V.
      1. (Redirection Subsystem) connect Motor Subsystem to HI BUS/Capacitor Subsystem to HI BUS/Dynamic Power Dissipation Subsystem to HI BUS/Battery Subsystem to LO BUS
      2. (Capacitor Subsystem) engage energy inflow mode
      3. (DC-DC Converter Subsystem) operate in step-down mode HI BUS to LO BUS
      4. (Battery Subsystem) [stateful] disconnect once the capacitor bank voltage reaches the battery voltage
      5. (Dynamic Power Dissipation Subsystem) [stateful] engage if the capacitor voltage rises beyond its operating limit (this is to provide enough motor braking capacity when the regenerative braking capacitors are fully charged and are no longer drawing any significant amount of current)

It is probably also important to mention its potential use in motor speed control (that is, if you have not noticed yet). By operating the DC-DC converter subsystem in the battery and capacitor-motor parallel configuration, it is possible to control the average power transfer to the motor (of course, this only holds when the motor is a conventional DC motor, not a BLDC or induction motor).

A Quick Guide to LanSchool Takeover

In remembrance of my high school days (well, only half a year ago to be exact) and for all the pranksters out there wanting to takeover their schools, I publish this quick guide to LanSchool takeover. As long as you are competent in this field, you will find this article very useful in implementing YOUR OWN LanSchool control application.

I am not sure about the older (below 5.x) and newer versions of LanSchool since I conducted my research and development some time last year (during the winter of 2011); however, I believe there should be no problem applying the contents of this article to the newer versions since the reverse engineering data contained within was never published online.

Haxchool

First, let's start with a brief overview of LanSchool.

  • LanSchool Student: is an application that is deployed all over the school computers (of course, that is if your school uses LanSchool for monitoring and control). It is essentially a client that listens on a specific port (both UDP and TCP) for control packets. You can think of it as an intentional backdoor installed by the school admins.
  • LanSchool Teacher: is a special application designed to send control packets to the LanSchool Student applications on the local network.

In this case, what we are attempting to create is an application equivalent to LanSchool Teacher- the ultimate application that allows you to control all computers on the school network.

The idea is simple: send out the packets that LanSchool Student will accept as legitimate packets from LanSchool Teacher.

In the older versions of LanSchool implementations, this process was extremely simple. LanSchool Student application would accept any properly formatted packets sent to the right port and do whatever the received packet commands it to do. This caused a major problem in the schools using LanSchool and the company responsible for developing this horribly designed software was forced to add a verification mechanism for the packets being sent over the network: packet encryption (well, too bad for them because I am just about to publish their packet encryption algorithm for everyone :p).

< LanSchool Packet Structure >

UInt8 Type1;        // 00
UInt8 Type2;        // 01
UInt16 Channel;     // 02
UInt16 Magic;       // 04
UInt32 Length;      // 06
... [Optional:
Type-specific Data] // 0A

< LanSchool Encryption: Cipher Key Generation >

The cipher key is dependent on the sender IPv4 address. LanSchool Student verifies the received packet by attempting to decrypt the packet with a cipher key generated from the remote IP address (sender IP address). If the decrypted packet fails the packet format test, the packet is not valid. Therefore, the cipher key must be specific to the IPv4 address of the transmitting network adapter.

The cipher key is generated as follows:

byte[] GenerateKey(uint senderIP)
{
    uint tempValue = senderIP;
    for (uint i = 0; i < 6; i++)
        tempValue = tempValue * 0x343FD + 0x269EC3;
    uint keyInteger = (tempValue & 0x7FFF0000) |
        (((tempValue * 0x343FD + 0x269EC3) & 0x7FFF0000) >> 16);
    byte[] keyBytes = new byte[4];
    keyBytes[0] = (byte)keyInteger;
    keyBytes[1] = (byte)(keyInteger >> 8);
    keyBytes[2] = (byte)(keyInteger >> 16);
    keyBytes[3] = (byte)(keyInteger >> 24);
    return keyBytes;
}

< LanSchool Encryption: Packet Cipher >

LanSchool only encrypts the first 10 bytes except [2-3] (header) of the packet; the rest is sent and received unencrypted. The packet cipher consists of two fundamental operations and four fundamental operation sequences.

The following are the cipher fundamental operations:

byte EncipherSubOp1(int keyIndex, byte data)
{
    byte temp = (byte)(_key[keyIndex] ^ data);
    if ((temp & 0x80) != 0)
        return (byte)((temp * 2) | 0x01);
    else
        return (byte)(temp * 2);
}

byte EncipherSubOp2(int keyIndex, byte data)
{
    byte temp = (byte)(_key[keyIndex] ^ data);
    if ((temp & 0x01) != 0)
        return (byte)((temp >> 1) | 0x80);
    else
        return (byte)(temp >> 1);
}

The following are the cipher operation sequences:

private byte EncipherOpAlpha(byte data)
{
    byte temp;
    temp = EncipherSubOp1(1, data);
    temp = EncipherSubOp2(0, temp);
    temp = EncipherSubOp2(3, temp);
    temp = EncipherSubOp1(2, temp);

    return temp;
}

private byte EncipherOpBeta(byte data)
{
    byte temp;
    temp = EncipherSubOp2(2, data);
    temp = EncipherSubOp2(1, temp);
    temp = EncipherSubOp2(0, temp);
    temp = EncipherSubOp1(3, temp);

    return temp;
}

private byte EncipherOpGamma(byte data)
{
    byte temp;
    temp = EncipherSubOp1(2, data);
    temp = EncipherSubOp2(1, temp);
    temp = EncipherSubOp1(0, temp);
    temp = EncipherSubOp1(0, temp);

    return temp;
}

private byte EncipherOpDelta(byte data)
{
    byte temp;
    temp = EncipherSubOp2(3, data);
    temp = EncipherSubOp1(2, temp);
    temp = EncipherSubOp1(1, temp);
    temp = EncipherSubOp2(0, temp);

    return temp;
}

The following is the Encrypt routine:

void Encrypt(byte[] data)
{
    data[0] = EncipherOpAlpha(data[0]);
    data[1] = EncipherOpBeta(data[1]);
    data[4] = EncipherOpAlpha(data[4]);
    data[5] = EncipherOpBeta(data[5]);
    data[6] = EncipherOpGamma(data[6]);
    data[7] = EncipherOpDelta(data[7]);
    data[8] = EncipherOpAlpha(data[8]);
    data[9] = EncipherOpBeta(data[9]);
}

You can easily figure out and implement Decrypt routine by writing the inverses of the cipher fundamental operations (gluck with that).

The following are the excerpts from my Haxchool application packet generator library for reference:

// ShellExecute
//  Private
//  TCP 59, 03
public static HaxPacket ShellExecute(ushort channel, string senderName, string commandLine)
{
    HaxPacket packet = new HaxPacket(59, 3, channel);
    packet.AddDWord(0x00000000);
    packet.AddDWord((uint)senderName.Length);
    packet.AddString(senderName);
    packet.AddBytes(new byte[64 - senderName.Length]);
    packet.AddDWord((uint)commandLine.Length);
    packet.AddString(commandLine);
    packet.AddDWord(0x00000000);
    return packet;
}

// Shutdown/Restart/Logoff
//  Private
//  TCP 30, 03
public static HaxPacket Shutdown(ushort channel, string senderName)
{
    HaxPacket packet = new HaxPacket(30, 3, channel);
    packet.AddDWord(0x0000000C);
    packet.AddDWord((uint)senderName.Length);
    packet.AddString(senderName);
    packet.AddBytes(new byte[64 - senderName.Length]);
    return packet;
}

 

I hope this disclosure of the encryption algorithm would help all the pranksters out there :) Good luck implementing your own.

p.s. I will not accept any questions about this article.

Operating System IX (Pt. 2, Kernel)

Continuing from the previous article, this article discusses the structure of the Operating System IX kernel. If you have not read the first part (Pt. 1, Initialiser) yet, please read it first.

First, let's take a quick look at the Main function of the kernel, where all initialisation routines are called (excerpt from Main.cpp):

extern "C" Void Main()
{
    // Obtain the MultiBoot Information
    _MultiBoot_BootInformation *MultiBootInformation;
    asm("" : "=b"(MultiBootInformation));
    // Initialise IDT
    IDT::Initialise();
    // Initialise TSS
    TSS::Initialise();
    TSS::SetIST(1, (UInt64)InterruptStack + _System_InterruptStackSize);
    // Initialise SOID
    InitialiseSOID(MultiBootInformation);
    // Initialise Physical Memory Page Manager
    PMPM::Initialise(MultiBootInformation);
    // Initialise Physical Memory Manager
    PMM::Initialise(MultiBootInformation);
    // Initialise Task Manager
    Task::Initialise();
    // Initialise Scheduler
    Scheduler::Initialise();
    // Drivers Initialisation
    // - Initialise Programmable Interrupt Controller
    PIC::Initialise();
    // - Initialise Programmable Interval Timer
    PIT::Initialise();
    // - Initialise Keyboard Controller
    KBC::Initialise();
    // - Initialise Console
    Console::Initialise();

    _Task_Context *PP = new _Task_Context();
    PP->Type = _Task_Type_System;
    PP->RIP = (UInt64)ITFunc;
    Task::AddTask(PP);

    _Task_Context *PP2 = new _Task_Context();
    PP2->Type = _Task_Type_System;
    PP2->RIP = (UInt64)ITFunc2;
    Task::AddTask(PP2);

    // Initialise Interrupt
    Interrupt::Initialise();
    Interrupt::Set(true);
    // Boot Message
    Console::GetSelectedConsole()->Write("Operating System IX // Version ");
    Console::GetSelectedConsole()->WriteLine("%d.%d.%d [%d/%d/%d]", _PSI_Version_Major, _PSI_Version_Minor, _PSI_Version_Revision, _PSI_Version_Date_Day, _PSI_Version_Date_Month, _PSI_Version_Date_Year);
    Console::GetSelectedConsole()->WriteLine(" Copyright (C) 2004-2009 Paradoxoft Corporation.");
    Console::GetSelectedConsole()->WriteLine(" Copyright (C) 2009 Stephanos San Io.");
    Console::GetSelectedConsole()->Write('\n');
    // Jump to the System Task
    SystemTask();
}

The kernel Main routine, like the Initialiser's, begins by obtaining the pointer to the multiboot information header stored in the EBX register. The register value is either preserved from GRUB or explicitly set to point to a GRUB-provided or artificial multiboot-compatible information header (note that the IX design philosophy is to support various execution environments, so the kernel is not restricted to the default OS-provided initialiser code). The multiboot information header is used in various steps of the kernel initialisation procedure as the primary source of base system information.

The first real step of the kernel initialisation procedure starts with the initialisation of the processor control structures. The first structure component to be initialised is the Interrupt Descriptor Table (IDT). The following is the code of IDT::Initialise routine (excerpt from IDT.cpp):

Void IDT::Initialise()
{
	// NOTE: IDT is at 0xFFF000
	// IDTR
	_IDT_Descriptor IDTR;
	IDTR.Limit = 0xFFF;
	IDTR.Base = (UInt64)_SystemIDT;
	asm("lidt [%0];" : : "m"(IDTR));
}

The IDT::Initialise routine essentially constructs an IDTR (IDT register) structure and loads it into the processor's IDTR with the LIDT instruction. The following is the structure and the graphical representation of the IDTR (from Intel® 64 and IA-32 Architectures Software Developer's Manual, Aug 2012, Figure 6-1. Relationship of the IDTR and IDT):

Operating System IX: Relationship of the IDTR and IDT

Note that IDTR.Base is set to the base physical address of the IDT given by the IDT::_SystemIDT pointer (0xFFF000), and IDTR.Limit (IDTR[0:15]) is set to 0xFFF, that is, the table size minus one. (The figure shows the 32-bit layout, in which the base occupies IDTR[16:47]; in 64-bit mode the base field is 64 bits wide.) Together they describe an IDT region of 0xFFF000-0xFFFFFF, which is sufficient for 256 descriptors (each descriptor is 16 bytes; 16 * 256 = 4096 = 0x1000).

Also note that no descriptor in the table is initialised at this time. All required descriptors (for exception handling, system operation, hardware I/O) are initialised later on in specific initialisation subroutines. This does not cause a system failure because interrupts were disabled beforehand: the CLI instruction cleared the interrupt flag and the hardware interrupt lines were masked on the PIC (Programmable Interrupt Controller).
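The limit arithmetic can be checked with a small user-space sketch. The names below (IdtRegister, kGateSize, kGateCount, MakeIdtr) are made up for illustration and are not taken from the OSIX sources; the structure is meant to mirror the 10-byte operand that LIDT reads in 64-bit mode, and the packed attribute is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the LIDT operand in 64-bit mode: a 16-bit limit
// followed directly by a 64-bit base, with no padding in between.
struct __attribute__((packed)) IdtRegister {
    std::uint16_t Limit; // size of the table in bytes, minus one
    std::uint64_t Base;  // address of the first descriptor
};
static_assert(sizeof(IdtRegister) == 10, "LIDT reads a 10-byte operand");

constexpr std::uint64_t kGateSize = 16;   // bytes per 64-bit IDT descriptor
constexpr std::uint64_t kGateCount = 256; // one descriptor per interrupt vector

// Builds the register image for a full 256-entry table at the given base.
constexpr IdtRegister MakeIdtr(std::uint64_t base) {
    return IdtRegister{static_cast<std::uint16_t>(kGateSize * kGateCount - 1), base};
}
```

With a base of 0xFFF000 this yields a limit of 0xFFF, so the last byte of the table sits at 0xFFFFFF.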

Following the initialisation of the IDT, the TSS (task-state segment) is initialised. Despite its name, the TSS is no longer used for hardware task switching in 64-bit mode. In fact, AMD, when they first designed the x86-64 architecture, decided to abandon the hardware task switching capability completely for performance reasons: the hardware mechanism is just not intelligent enough to compete with selective software task switching, which switches only the registers and system resources that actually need to change and thereby reduces the amount of data moved per task switch. Instead, the TSS stores the stack pointers used for privilege-level switching and interrupt calls. The following is the structure of the TSS (Figure 7-11):

Operating System IX: 64-bit TSS Format

RSP[0-2] are, as the name suggests, the stack pointers for each privilege level (Ring 0 to 2); they are used to set the value of RSP when a privilege-level transition occurs. IST[0-6] (named IST1 to IST7 in the manual) are the interrupt stack table entries; each entry specifies a different stack pointer to be used by interrupt calls. The interrupt stack table index is specified in the interrupt descriptor of the IDT.
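For reference, here is a minimal sketch of how such a structure could be declared so that it comes out at exactly 104 bytes. The field names follow the members touched by TSS::Initialise, but the field widths and the SetIst helper are my own assumptions based on the processor manuals, not the actual OSIX declaration (the packed attribute is a GCC/Clang extension).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical declaration of the 64-bit TSS layout.
struct __attribute__((packed)) Tss64 {
    std::uint32_t Reserved0;
    std::uint64_t RSP[3];    // stack pointers for rings 0, 1 and 2
    std::uint64_t Reserved1;
    std::uint64_t IST[7];    // IST1..IST7 in the manuals
    std::uint64_t Reserved2;
    std::uint16_t Reserved3;
    std::uint16_t IOMap;     // offset of the I/O permission bitmap
};
static_assert(sizeof(Tss64) == 104, "the 64-bit TSS is 104 bytes long");
static_assert(offsetof(Tss64, RSP) == 4, "RSP0 starts at byte 4");
static_assert(offsetof(Tss64, IST) == 36, "IST1 starts at byte 36");
static_assert(offsetof(Tss64, IOMap) == 102, "I/O map base is the last word");

// Stores a stack top under the 1-based index that an IDT gate would use.
// Returns false for indices outside 1..7.
inline bool SetIst(Tss64 &tss, unsigned logicalIndex, std::uint64_t stackTop) {
    if (logicalIndex < 1 || logicalIndex > 7)
        return false;
    tss.IST[logicalIndex - 1] = stackTop;
    return true;
}
```

The 1-based helper reflects the fact that an IST index of 0 in an interrupt descriptor means that no interrupt stack switch is requested.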

Now that we have some understanding of the 64-bit TSS, let's take a look at the code of the TSS::Initialise procedure:

Void TSS::Initialise()
{
	// NOTE: TSS is at 0xFF9000
	// TSS
	_SystemTSS->Reserved0 = 0;
	_SystemTSS->RSP[0] = Null;
	_SystemTSS->RSP[1] = Null;
	_SystemTSS->RSP[2] = Null;
	_SystemTSS->Reserved1 = 0;
	_SystemTSS->IST[0] = Null;
	_SystemTSS->IST[1] = Null;
	_SystemTSS->IST[2] = Null;
	_SystemTSS->IST[3] = Null;
	_SystemTSS->IST[4] = Null;
	_SystemTSS->IST[5] = Null;
	_SystemTSS->IST[6] = Null;
	_SystemTSS->Reserved2 = 0;
	_SystemTSS->Reserved3 = 0;
	_SystemTSS->IOMap = Null;
	// TR
	asm(" \
	ltr ax; \
	" : : "a"(_Selector_TSS64));
}

What a pity. There isn't much of interest in the TSS::Initialise routine. The first few lines simply set all entries to Null (0); just another dull but crucial initialisation routine. However, the last few lines are a bit more interesting. If you recall from Part 1 of this article, the GDT set by the initialiser contains a special descriptor entry with the comment 64-bit TSS Descriptor. The _Selector_TSS64 constant is the selector loaded into the TR (task register); it points to the index of that descriptor (yes, this is indeed a strange way of specifying a structure base address).

Following the initialisation of the TSS is the initialisation of its entries; in this case, only IST logical index 1 (IST[0]), which is set by the TSS::SetIST call in Main. IST[0] is used for all interrupts in OSIX at the moment (the current design does not require more than one IST).
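To show where the IST index ends up, here is a hedged sketch of a 16-byte interrupt gate as laid out in the processor manuals. Gate64 and MakeInterruptGate are illustrative names of mine, and the selector passed in the usage note is an assumption; the OSIX descriptor setup code is not shown in this article.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit IDT gate descriptor (packed is a GCC/Clang extension).
struct __attribute__((packed)) Gate64 {
    std::uint16_t OffsetLow;  // handler address bits 0-15
    std::uint16_t Selector;   // code segment selector
    std::uint8_t  Ist;        // bits 0-2: IST index, 0 = no stack switch via IST
    std::uint8_t  TypeAttr;   // 0x8E = present, DPL 0, 64-bit interrupt gate
    std::uint16_t OffsetMid;  // handler address bits 16-31
    std::uint32_t OffsetHigh; // handler address bits 32-63
    std::uint32_t Reserved;
};
static_assert(sizeof(Gate64) == 16, "64-bit gates are 16 bytes long");

// Builds a ring 0 interrupt gate that switches to the given IST entry.
inline Gate64 MakeInterruptGate(std::uint64_t handler, std::uint16_t selector,
                                std::uint8_t ist) {
    Gate64 gate{};
    gate.OffsetLow = static_cast<std::uint16_t>(handler & 0xFFFF);
    gate.Selector = selector;
    gate.Ist = static_cast<std::uint8_t>(ist & 0x7);
    gate.TypeAttr = 0x8E;
    gate.OffsetMid = static_cast<std::uint16_t>((handler >> 16) & 0xFFFF);
    gate.OffsetHigh = static_cast<std::uint32_t>(handler >> 32);
    return gate;
}
```

A call such as MakeInterruptGate(handlerAddress, 0x20, 1) would describe a handler that runs on the stack stored in the first IST entry; 0x20 is only an assumed code selector.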

After the TSS initialisation, the System Operating Information Descriptor (SOID) is initialised by calling InitialiseSOID function with the multiboot information header pointer as its parameter. The SOID is an operating system-specific structure that OSIX uses to keep track of all important system information and data structures. The following is the structure definition of the SOID (from System.h):

// System Operating Information Descriptor (SOID)
typedef struct
{
	// Physical Memory
	UInt64 PhysicalMemorySize; // Size of the physical memory in bytes
	UInt64 PhysicalPages; // Number of physical pages

	// Physical Memory Page Manager
	UInt64 *PMPM_Table;

	// Physical Memory Block Manager

	// Physical Memory Manager
	_PMM_BlockHeader *PMM_FirstBlock;

	// Tasks
	UInt64 Task_Count;
	_Task_Context *Task_FirstContext;
	_Task_Context *Task_LastContext;
} _SOID;

The following is the code of InitialiseSOID routine (from Main.cpp):

Void InitialiseSOID(_MultiBoot_BootInformation *MultiBootInformation)
{
    SOID.PhysicalMemorySize = ((UInt64)MultiBootInformation->MemoryLower + (UInt64)MultiBootInformation->MemoryUpper) * 1024;
    SOID.PhysicalMemorySize -= SOID.PhysicalMemorySize % 4096;
    SOID.PhysicalPages = SOID.PhysicalMemorySize / 4096;
    // One PDP table maps 512 GiB (0x8000000000 bytes)
    UInt64 NumberOfPDPs = (SOID.PhysicalMemorySize / 0x8000000000ULL) + ((SOID.PhysicalMemorySize % 0x8000000000ULL) ? 1 : 0);
    UInt64 NumberOfPDs = (SOID.PhysicalMemorySize / 0x40000000ULL) + ((SOID.PhysicalMemorySize % 0x40000000ULL) ? 1 : 0);
    UInt64 NumberOfPTs = (SOID.PhysicalMemorySize / 0x200000ULL) + ((SOID.PhysicalMemorySize % 0x200000ULL) ? 1 : 0);
    UInt64 TotalPagingStructuresSize = (NumberOfPDPs + NumberOfPDs + NumberOfPTs + 1) * 4096;
    UInt64 Oversized = TotalPagingStructuresSize % 0x100000;
    SOID.PMPM_Table = (UInt64 *)(0x1000000 + (TotalPagingStructuresSize + (0x100000 - (Oversized == 0 ? 0x100000 : Oversized))));
    SOID.PMM_FirstBlock = (_PMM_BlockHeader *)(0x1000000 + (TotalPagingStructuresSize + (0x100000 - (Oversized == 0 ? 0x100000 : Oversized))));
}

The first two lines of the code calculate the total physical memory size by adding the lower and upper memory area sizes provided by the multiboot information header (reported in KiB, hence the multiplication by 1024) and rounding the result down to a page boundary. The next line calculates the number of physical pages by dividing the total physical memory size by the page size (4096 bytes, since OSIX uses 4-KiB paging). The next few lines calculate the total size of the paging structures, (number of PDPs + PDs + PTs + one PML4) * 4096, since every table, regardless of its level, is 4096 bytes in size. The purpose of this calculation is to find an appropriate base address, rounded up to a 1-MiB boundary above the paging structures that start at 0x1000000, for the Physical Memory Page Manager table, which records the state of every page of the physical memory.
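The calculation is easier to follow with concrete numbers. The following is a minimal user-space sketch, not the OSIX code itself: the names (PagingLayout, ComputeLayout and the constants) are hypothetical, and it assumes that one PDP table covers 512 GiB, one PD covers 1 GiB and one PT covers 2 MiB.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kPageSize = 0x1000;           // 4 KiB
constexpr std::uint64_t kPtSpan = 0x200000ULL;        // one page table maps 2 MiB
constexpr std::uint64_t kPdSpan = 0x40000000ULL;      // one page directory maps 1 GiB
constexpr std::uint64_t kPdpSpan = 0x8000000000ULL;   // one PDP table maps 512 GiB
constexpr std::uint64_t kStructureBase = 0x1000000;   // where the PML4 is placed
constexpr std::uint64_t kMiB = 0x100000;

// Integer division that rounds up.
constexpr std::uint64_t CeilDiv(std::uint64_t value, std::uint64_t unit) {
    return value / unit + ((value % unit) ? 1 : 0);
}

struct PagingLayout {
    std::uint64_t Pdps, Pds, Pts; // number of tables per level
    std::uint64_t TotalBytes;     // size of all paging structures
    std::uint64_t TableBase;      // first 1-MiB boundary after the structures
};

inline PagingLayout ComputeLayout(std::uint64_t memoryBytes) {
    PagingLayout layout{};
    layout.Pdps = CeilDiv(memoryBytes, kPdpSpan);
    layout.Pds = CeilDiv(memoryBytes, kPdSpan);
    layout.Pts = CeilDiv(memoryBytes, kPtSpan);
    // The extra 1 accounts for the single PML4 table.
    layout.TotalBytes = (layout.Pdps + layout.Pds + layout.Pts + 1) * kPageSize;
    layout.TableBase = kStructureBase + CeilDiv(layout.TotalBytes, kMiB) * kMiB;
    return layout;
}
```

For 4 GiB of memory this gives 1 PDP, 4 PDs and 2048 PTs, that is 2054 tables or 0x806000 bytes, which rounds up to 0x900000 and places the table at 0x1900000.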

Following the initialisation of the SOID is the initialisation of the Physical Memory Page Manager (PMPM). The following is the code of the PMPM::Initialise routine (from PMPM.cpp):

Void PMPM::Initialise(_MultiBoot_BootInformation *MultiBootInformation)
{
	UInt64 PMPMTableSize = ((UInt64)SOID.PhysicalPages) * 8;
	// Initialise all the table entries as Not Allocated.
	for (UInt64 i = 0; i < SOID.PhysicalPages; i++)
		SOID.PMPM_Table[i] = _PMPM_AllocationCount_NotAllocated;
	// Set the reserved zone
	// - Low 16 MB + Paging Structures + PMPM Table
	UInt64 TotalEndPageNumber = (((UInt64)SOID.PMPM_Table) / 4096) + (PMPMTableSize / 4096) + (((PMPMTableSize % 4096) == 0) ? 0 : 1);
	for (UInt64 i = 0; i < TotalEndPageNumber; i++)
		SOID.PMPM_Table[i] = _PMPM_AllocationCount_Reserved;
	// - Reserved Area Information provided by the GRUB
	_MultiBoot_MemoryMapInformation *MemoryMapInformation = (_MultiBoot_MemoryMapInformation *)MultiBootInformation->MemoryMapTable;
	for (UInt64 i = 0; i < (MultiBootInformation->MemoryMapTableLength / 24); i++)
	{
		if (MemoryMapInformation[i].Type != 1)
		{
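			UInt64 BasePageNumber = MemoryMapInformation[i].BaseAddress / 4096;
			// NOTE: Length is assumed to be the name of the field holding the region size in bytes
			UInt64 PageCount = (MemoryMapInformation[i].Length / 4096) + (((MemoryMapInformation[i].Length % 4096) == 0) ? 0 : 1);
			// A separate index (j) is used so that the outer loop variable is not shadowed;
			// pages beyond the end of the table are skipped
			for (UInt64 j = 0; (j < PageCount) && ((BasePageNumber + j) < SOID.PhysicalPages); j++)
				SOID.PMPM_Table[BasePageNumber + j] = _PMPM_AllocationCount_Reserved;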
		}
	}
}

The routine starts by calculating the size of the PMPM table and initialising all table entries with the _PMPM_AllocationCount_NotAllocated value, indicating that the pages are not allocated. It then marks the system's own zone (the low 16 MB, the paging structures and the PMPM table itself) as reserved. Finally, it identifies all reserved areas in the physical memory (used by the system firmware or memory-mapped I/O devices) from the memory map in the multiboot information header and sets all pages belonging to those areas to _PMPM_AllocationCount_Reserved, indicating that the pages are reserved and not available for general-purpose allocation.
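The three passes can be modelled in user space as follows. This is only a sketch under my own assumptions: PageState, Region and BuildPageTable are hypothetical names, a std::vector stands in for the in-memory table, and a region is described by a base address, a length in bytes and a usable flag.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kPageBytes = 4096;

enum class PageState : std::uint8_t { NotAllocated, Reserved };

// Hypothetical, simplified memory map entry.
struct Region {
    std::uint64_t Base;
    std::uint64_t Length;
    bool Usable;
};

inline std::vector<PageState> BuildPageTable(std::uint64_t pageCount,
                                             std::uint64_t systemPages,
                                             const std::vector<Region> &map) {
    // Pass 1: every page starts out as not allocated.
    std::vector<PageState> table(pageCount, PageState::NotAllocated);
    // Pass 2: the leading pages that hold the system structures are reserved.
    for (std::uint64_t page = 0; page < systemPages && page < pageCount; ++page)
        table[page] = PageState::Reserved;
    // Pass 3: every page touched by a non-usable region is reserved.
    for (const Region &region : map) {
        if (region.Usable)
            continue;
        std::uint64_t first = region.Base / kPageBytes;
        // Exclusive end page; a partially covered page counts as covered.
        std::uint64_t end = (region.Base + region.Length + kPageBytes - 1) / kPageBytes;
        for (std::uint64_t page = first; page < end && page < pageCount; ++page)
            table[page] = PageState::Reserved;
    }
    return table;
}
```

Note the bounds check in the last loop: firmware memory maps commonly report reserved regions far above the end of the installed memory, and those must not be written into the table.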

After the initialisation of the Physical Memory Page Manager (PMPM), the initialisation of the Physical Memory Manager (PMM) follows. Below is the code of PMM::Initialise function (from PMM.cpp):

Void PMM::Initialise(_MultiBoot_BootInformation *MultiBootInformation)
{
	// Create the initial free node
	SOID.PMM_FirstBlock->Length = SOID.PhysicalMemorySize - (UInt64)SOID.PMM_FirstBlock;
	SOID.PMM_FirstBlock->Flags = 0;
	SOID.PMM_FirstBlock->NextHeader = SOID.PMM_FirstBlock->PreviousHeader = Null;
}

The PMM initialisation routine is quite simple: it creates an initial free node with the size of the remaining physical memory space (total physical memory size minus the address of PMM_FirstBlock).

Before explaining further, let me elaborate on the internal workings of the Physical Memory Manager. The PMM is essentially a set of operations on a physical memory-wide linked list structure. The entire upper memory (above all initial system structures) is treated as a linked list in which each node represents a memory block. To be more specific, each memory block managed by the PMM consists of a PMM block header, containing the allocation status and the addresses of the previous and next nodes, followed by the memory space of the block. Initially, as explained earlier, a single free block located at the beginning of the upper memory area and spanning the entire upper memory is provided. When an allocation is requested, the PMM looks for a free memory block that is large enough (at first, the initial free node, since no other node is present) and splits it into a block of the requested size, marked allocated, and a remainder, marked free.

In a similar sense, the deallocation operation marks the specified node free and merges it with the adjacent free nodes (if any) to prevent block fragmentation. If nodes are not merged at the time of deallocation, and no separate periodic defragmentation routine exists, then, depending on how often small memory blocks are allocated, the PMM will eventually run out of free nodes large enough for a large block allocation even if the total free memory space is larger than the requested block size.

The following are some important bits of the PMM (from PMM.h and PMM.cpp):

// From PMM.h
struct _PMM_BlockHeader
{
	UInt64 Length;
	UInt8 Flags;
	_PMM_BlockHeader *NextHeader;
	_PMM_BlockHeader *PreviousHeader;
};

// From PMM.cpp
Void *PMM::Allocate(UInt64 Size, UInt8 Flags)
{
	Size += sizeof(_PMM_BlockHeader);
	Size += 16 - (Size % 16); // align at 16-byte boundary
	_PMM_BlockHeader *CurrentBlock = SOID.PMM_FirstBlock;
	// Search for the empty block
	while ((CurrentBlock->Flags != _PMM_Flag_Free) || (CurrentBlock->Length < Size))
	{
		if (CurrentBlock->NextHeader == Null) // No remaining space
			return (Void *)Invalid;
		CurrentBlock = CurrentBlock->NextHeader;
	}
	// Found
	if (CurrentBlock->Length == Size)
		CurrentBlock->Flags = Flags;
	else
	{
		_PMM_BlockHeader *NewBlock = (_PMM_BlockHeader *)((UInt64)CurrentBlock + Size);
		// - New Block
		NewBlock->Length = CurrentBlock->Length - Size;
		NewBlock->Flags = _PMM_Flag_Free;
		NewBlock->PreviousHeader = CurrentBlock;
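		NewBlock->NextHeader = CurrentBlock->NextHeader;
		if (NewBlock->NextHeader != Null) // keep the back link of the following block consistent
			NewBlock->NextHeader->PreviousHeader = NewBlock;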
		// - Current Block
		CurrentBlock->Length = Size;
		CurrentBlock->Flags = Flags;
		CurrentBlock->NextHeader = NewBlock;
	}
	return (Void *)((UInt64)CurrentBlock + sizeof(_PMM_BlockHeader));
}

Void PMM::Deallocate(Void *Address)
{
	_PMM_BlockHeader *TargetBlock = (_PMM_BlockHeader *)((UInt64)Address - sizeof(_PMM_BlockHeader));
	TargetBlock->Flags = _PMM_Flag_Free;
	// Merge the following free block into the target block
	if (TargetBlock->NextHeader != Null)
	{
		if (TargetBlock->NextHeader->Flags == _PMM_Flag_Free)
		{
			TargetBlock->Length += TargetBlock->NextHeader->Length;
			DeleteNode((Void *)TargetBlock->NextHeader);
		}
	}
	// Merge the target block into the preceding free block; the header at
	// the lower address has to survive so that its Length still describes
	// the memory that follows it
	if (TargetBlock->PreviousHeader != Null)
	{
		if (TargetBlock->PreviousHeader->Flags == _PMM_Flag_Free)
		{
			TargetBlock->PreviousHeader->Length += TargetBlock->Length;
			DeleteNode((Void *)TargetBlock);
		}
	}
}

Void PMM::DeleteNode(Void *Address)
{
	_PMM_BlockHeader *TargetBlock = (_PMM_BlockHeader *)Address;
	if (TargetBlock->PreviousHeader != Null)
		TargetBlock->PreviousHeader->NextHeader = TargetBlock->NextHeader;
	if (TargetBlock->NextHeader != Null)
		TargetBlock->NextHeader->PreviousHeader = TargetBlock->PreviousHeader;
	TargetBlock->Length = 0;
}
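To experiment with the split-and-merge behaviour outside of a kernel, the same idea can be modelled in user space. The sketch below is not the OSIX implementation: FreeList, BlockHeader and Absorb are hypothetical names, a std::vector stands in for the upper memory, and a block is split only when the remainder can hold a header plus 16 bytes of payload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

struct BlockHeader {
    std::uint64_t Length; // header plus payload, in bytes
    bool Free;
    BlockHeader *Next;
    BlockHeader *Previous;
};

class FreeList {
public:
    // bytes is expected to be a multiple of 16.
    explicit FreeList(std::size_t bytes) : storage_(bytes / sizeof(std::uint64_t)) {
        first_ = new (storage_.data()) BlockHeader{bytes, true, nullptr, nullptr};
    }

    void *Allocate(std::uint64_t size) {
        // Add room for the header and round up to a 16-byte boundary.
        size = (size + sizeof(BlockHeader) + 15) & ~std::uint64_t{15};
        for (BlockHeader *block = first_; block != nullptr; block = block->Next) {
            if (!block->Free || block->Length < size)
                continue;
            if (block->Length - size >= sizeof(BlockHeader) + 16) {
                // Split: the tail of the block becomes a new free node.
                void *at = reinterpret_cast<unsigned char *>(block) + size;
                BlockHeader *rest =
                    new (at) BlockHeader{block->Length - size, true, block->Next, block};
                if (rest->Next != nullptr)
                    rest->Next->Previous = rest;
                block->Next = rest;
                block->Length = size;
            }
            block->Free = false;
            return block + 1; // the payload starts right after the header
        }
        return nullptr; // no free block is large enough
    }

    void Deallocate(void *payload) {
        BlockHeader *block = static_cast<BlockHeader *>(payload) - 1;
        block->Free = true;
        if (block->Next != nullptr && block->Next->Free)
            Absorb(block, block->Next);
        if (block->Previous != nullptr && block->Previous->Free)
            Absorb(block->Previous, block);
    }

    std::size_t BlockCount() const {
        std::size_t count = 0;
        for (const BlockHeader *block = first_; block != nullptr; block = block->Next)
            ++count;
        return count;
    }

private:
    // Grows the lower-addressed node over its right-hand neighbour.
    static void Absorb(BlockHeader *keep, BlockHeader *gone) {
        keep->Length += gone->Length;
        keep->Next = gone->Next;
        if (gone->Next != nullptr)
            gone->Next->Previous = keep;
    }

    std::vector<std::uint64_t> storage_;
    BlockHeader *first_;
};
```

Allocating three blocks from a 4096-byte arena leaves four nodes; freeing them in any order should collapse the list back into a single free node, after which a request for the whole arena succeeds again.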

After the PMM initialisation procedure is completed, the task subsystem is initialised. Note that the implementation of the task subsystem is highly dependent on PMM since most of its operations require dynamic memory allocation. The following is the code of Task::Initialise (from Task.cpp):

Void Task::Initialise()
{
	_NextIdAssignment = 0;
	SOID.Task_Count = 0;
	SOID.Task_FirstContext = SOID.Task_LastContext = Null;
}

The routine initialises all variables essential for task management. The first line initialises the _NextIdAssignment variable, which holds the identifier value to be used for the next task allocation, and the next two lines initialise the task count and the pointers to the first and last task contexts in the SOID.

To be continued in Operating System IX (Pt. 3, Kernel and Drivers)

Operating System IX (Pt. 1, Initialiser)

Project IX was a hobby operating system project that I conducted some time in 2008-2009. The main goal of the project was to implement a multitasking x86-64 kernel with some essential features to be considered as a core operating system.

The project was quite a big deal to me at the time because hobby operating system development on the x86-64 platform was still largely unexplored and considered rather mysterious (that is, not many hobby OS developers knew much about the x86-64 architecture).

The following is a screenshot of the operating system running two simultaneous tasks in QEMU (on a side note, the kernel ran just fine on VMware as well as on real x86-64 systems by AMD and Intel):

The first character at the top right of the screen is the time indicator (updated by the interrupt service routine of the Programmable Interval Timer interrupt), and the second character is the keyboard indicator (updated by the ISR of the Keyboard Controller interrupt and set to the scan code value; yes, keyboard scan codes do not numerically correspond to ASCII codes). The two characters floating somewhere in the middle of the screen are the time indicators for the two simultaneously running tasks (time-slice multitasking, since I had not written the symmetric multiprocessing support code at that time). Although the screen output may look very basic, it took a considerable amount of effort to get it running on all x86-64 systems.

The main problem with the subject of OS kernel development on x86-64 platform is that, regardless of the existence of the PC standard, all systems implement their hardware and firmware base in slightly different ways such that hard-coded implementation methods for one system usually result in fatal error in other systems.

The OSIX consists of the following two modules:

  • Initialiser: initialises the system environment for kernel execution.
  • Kernel: handles all base system operations including memory management and scheduling.

Because one of the kernel development goals was to support multiple execution environments, it was essential that the kernel support various kinds of boot media. However, this requirement also implied implementing an entire custom boot loader base consisting of a number of different boot device drivers and file system drivers (which was highly impractical for this project). For this reason, the initial boot step (loading the initialiser and kernel images from the boot medium) is performed by the well-known GRUB (GNU GRand Unified Bootloader).

The following is a screenshot of GRUB displaying Operating System IX load configuration:

Interestingly, the initialiser image (init.x) is loaded as the kernel by GRUB and the actual kernel image is loaded as a module. The reason is simple: GRUB jumps to the kernel image, not to a module image, after the load process is complete.

Once all image files are loaded by GRUB and the processor control is transferred to the Initialiser entry point, the following operations are performed (excerpts from Operating System IX, Initialiser Procedure #000002 Rev 1.):


1. Overall Procedure

  • Save the address of the multi-boot header.
  • [Sub-Procedure: IsLMSupported] Check if the Long Mode is supported by the processor.
  • [Sub-Procedure: IsKMLoaded] Check if the kernel image is loaded as a module.
  • [Sub-Procedure: RelocKernel] Relocate the kernel image to the expected load address.
  • [Sub-Procedure: InitPaging] Initialise the paging system.
  • [Sub-Procedure: InitSOID] Initialise the System Operating Information Descriptor (SOID).
  • [Sub-Procedure: InitPPAT] Initialise the Physical Page Allocation Table (PPAT).
  • Enable the 64-bit mode.
  • Jump to the kernel.

2. Sub-Procedure: IsLMSupported

  • Return bit 29 of EDX from CPUID with EAX = 0x80000001.

3. Sub-Procedure: IsKMLoaded

  • Return the value of bit 3 of the multi-boot information flags field.

4. Sub-Procedure: RelocKernel

  • Move the kernel image at the multi-boot compatible boot loader provided module address to the expected address.

5. Sub-Procedure: InitPaging

  • Calculate the 2-MB aligned maximum possible physical address.
  • Calculate the number of the required paging structures to map the whole physical address area into the virtual address area.
  • Calculate the base address of each table.
  • Map the physical pages into the page tables.
  • Enable the Physical Address Extension (PAE) paging mode.
  • Initialise the CR3 PML4 table base address.

6. Sub-Procedure: InitSOID

  • Initialise the SOID with the initialiser-provided basic system information.

7. Sub-Procedure: InitPPAT

  • Initialise all PPAT entries with the unallocated value.
  • Mark all the reserved areas unallocatable based on the multi-boot information.
  • Mark all pre-mapped reserved areas unallocatable.

 

The following is the memory map of the IX system (from Operating System IX, Physical Memory Map #000003 Rev 0):


1. Factors

  1. Global Descriptor Table (GDT): 32 descriptors, 16 bytes each.
  2. Interrupt Descriptor Table (IDT): 256 descriptors, 16 bytes each.
  3. Task State Segment (TSS): 104 bytes in total.
  4. System Operating Information Descriptor (SOID): 8 KiB.
  5. Initialiser Image: 1 MiB.
  6. Kernel Image: 2 MiB.
  7. Kernel Process Default Memory Pool: 8 MiB.
  8. Page Tables: varies.
  9. Physical Page Allocation Table (PPAT): varies.

2. Considered Exceptions

  1. The ISA Memory Hole at 15 MiB-16 MiB was not considered because the modern PC systems do not use the ISA buses.
  2. There may be more memory holes in the middle of the address space. In this physical memory map, the extra non-standard holes are not considered, but the Physical Page Manager will automatically block them according to the BIOS information.

3. Physical Memory Map

**NOTE: The address ranges and sizes of the objects specified may differ from the actual values used in the implementation.

Range                              Size                              Description
0000000000000000-00000000000001FF  16 * 32 = 512 bytes               Global Descriptor Table (GDT)
0000000000000200-0000000000000267  104 bytes                         Task State Segment (TSS)
0000000000000268-00000000000003FF  408 bytes                         Free Space
0000000000000400-00000000000004FF  256 bytes                         BIOS Data Area (BDA)
0000000000000500-00000000000014FF  16 * 256 = 4096 bytes             Interrupt Descriptor Table (IDT)
0000000000001500-0000000000001FFF  2816 bytes                        Free Space
0000000000002000-0000000000003FFF  8192 bytes                        System Operating Information Descriptor (SOID)
0000000000004000-000000000009FBFF  637952 bytes                      Free Space
000000000009FC00-000000000009FFFF  1024 bytes                        Extended BIOS Data Area (EBDA)
00000000000A0000-00000000000FFFFF  393216 bytes                      Memory Mapped Hardware I/O Area [ROM Area]
0000000000100000-00000000002FFFFF  2097152 bytes                     Kernel Image
0000000000200000-00000000002FFFFF  1048576 bytes                     Initialiser Image
0000000000200000-00000000009FFFFF  8388608 bytes                     Kernel Process Default Memory Pool
0000000000A00000-Var1              varies                            Page Tables
Var1-Var2                          varies                            Physical Page Allocation Table (PPAT)
Var2-00000000FEBFFFFF              varies                            Free Space
00000000FEC00000-00000000FFFFFFFF  varies                            System Reserved Area (BIOS, PnP NVRAM, ACPI, APIC, etc.)
0000000100000000-FFFFFFFFFFFFFFFF  varies (on physical memory size)  Free Space

Var1 = varies on the size of the page tables required to map the whole Physical Address Area into the Kernel Virtual Address Area.
Var2 = varies on the size of the PPAT.


Since the long English version of the procedure description is not much fun to read, let's take a look at the actual code of the implementation. The following is the GDT structure used by the Initialiser:

GDT:
    dw 0x0000, 0x0000, 0x0000, 0x0000
    dw 0xFFFF, 0x0000, 0x9200, 0x008F ; Data (Ring 0)
    dw 0x0000, 0x0000, 0xF200, 0x008F ; Data (Ring 3)
    dw 0xFFFF, 0x0000, 0x9A00, 0x00CF ; 32-bit Code Ring 0
    dw 0x0000, 0x0000, 0x9A00, 0x0020 ; 64-bit Code Ring 0
    dw 0x0000, 0x0000, 0xFA00, 0x0020 ; 64-bit Code Ring 3
    ; 64-bit TSS Descriptor
    TSSLimit	equ 0xFF
    TSSBase		equ 0xFF9000
    dw TSSLimit & 0xFFFF, TSSBase & 0xFFFF
    db (TSSBase & 0xFF0000) >> 16, 0x89
    db ((TSSLimit & 0xF0000) >> 16), (TSSBase & 0xFF000000) >> 24
    dd (TSSBase & 0xFFFFFFFF00000000) >> 32, 0

As you can see from the structure above, there are 6 main entries in the Global Descriptor Table (GDT): Data R0/R3, 32-bit Code R0, 64-bit Code R0/R3, and TSS. Since kernel resources must be protected from user tasks, the code and data segments for R0 and R3 are specified and indexed separately (an attempt by less privileged code to access a segment whose descriptor demands a higher privilege results in a processor-level fault, which is handled by one of the exception ISRs and eventually leads to the termination of the executing task). In addition to the descriptors required by the protection mechanisms, there is a TSS descriptor which specifies the address of the TSS (task state segment). Despite its name, in the x86-64 architecture the TSS does not hold any task state information; instead, it holds system-wide interrupt operating information fields (discussed further later in this series).
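The bit shuffling in the TSS descriptor is easier to verify in a higher-level form. Below is a hedged sketch that packs a base and a limit into the same 16-byte layout; SystemDescriptor64, EncodeTssDescriptor and SelectorFor are names invented for this example, and the selector helper simply assumes the usual index * 8 encoding.

```cpp
#include <cassert>
#include <cstdint>

// A 16-byte system descriptor viewed as two little-endian 64-bit words.
struct SystemDescriptor64 {
    std::uint64_t Low;
    std::uint64_t High;
};

constexpr std::uint64_t kTypeAvailableTss64 = 0x89; // present, DPL 0, type 9

inline SystemDescriptor64 EncodeTssDescriptor(std::uint64_t base, std::uint32_t limit) {
    SystemDescriptor64 descriptor{};
    descriptor.Low = (limit & 0xFFFFULL)                  // limit bits 0-15
                   | ((base & 0xFFFFFFULL) << 16)         // base bits 0-23
                   | (kTypeAvailableTss64 << 40)          // access byte
                   | (((limit >> 16) & 0xFULL) << 48)     // limit bits 16-19
                   | (((base >> 24) & 0xFFULL) << 56);    // base bits 24-31
    descriptor.High = base >> 32;                         // base bits 32-63, rest reserved
    return descriptor;
}

// Selector for a GDT index with the given requested privilege level.
constexpr std::uint16_t SelectorFor(unsigned index, unsigned rpl = 0) {
    return static_cast<std::uint16_t>((index << 3) | (rpl & 3));
}
```

With the base 0xFF9000 and limit 0xFF from the listing, the low word comes out as 0x000089FF900000FF, which matches the bytes emitted by the dw/db lines. The TSS descriptor is the seventh entry (index 6) of the table, so under the assumed encoding its selector would be 0x30.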

Now, let's discuss the entry procedure implementation of the Initialiser. The internal Initialiser process consists of the following three major steps:

  1. Entry.asm::Entry: reloads the GDT (abandons the previous GDT, also note that Entry procedure is executed in Protected Mode entered by GRUB) and executes a far jump operation to InitialisedEntry procedure with the 32-bit R0 code segment index specified by the new GDT.
  2. Entry.asm::InitialisedEntry: initialises all segment and stack pointer registers for the new GDT and jumps to Main
  3. Main.cpp::Main: performs the rest of the initialisation operation

Everything is very clear up to the beginning of the Main procedure. What exactly does the Main procedure do? Let's take a look at its first few lines:

extern "C" Void Main()
{
	// Obtain the MultiBoot Information
	_MultiBoot_BootInformation *MultiBootInformation;
	asm("" : "=b"(MultiBootInformation));
...

Yes, the first thing it does is obtain the address of the multiboot information structure from GRUB. Note that the address of the structure is passed in the EBX register (no code in Entry.asm modifies EBX, so the value passed from GRUB is preserved). Well, either way, this is not very interesting. Let's see what we have in the next few lines:

	// Check if the CPU supports Long Mode
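	UInt32 LMCheck_CPUID_EAX = 0x80000001, LMCheck_CPUID_EDX;
	// CPUID overwrites EAX, EBX, ECX and EDX, so all four are declared to the compiler
	asm volatile("cpuid;" : "+a"(LMCheck_CPUID_EAX), "=d"(LMCheck_CPUID_EDX) : : "ebx", "ecx");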
	if (!(LMCheck_CPUID_EDX & (1 << 29)))
	{
		Print("Error: CPU does not support Long Mode.");
		for (; ; ) ;
	}

This is the sub-procedure described earlier in the Initialiser Procedure document: IsLMSupported. The routine simply checks whether bit 29 of the EDX register returned by CPUID with EAX = 0x80000001 is set. This bit indicates whether the executing processor supports Long Mode, and therefore whether it can execute 64-bit code. If the bit is not set, the routine prints an error message and enters the famous infinite loop (remember, we are writing our own kernel; there is no other base system that can babysit us).
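The same check can be tried from an ordinary user-space program. This is a rough sketch that assumes GCC or Clang, whose cpuid.h header provides __get_cpuid; HasLongMode and CpuSupportsLongMode are names I made up for the example, and on non-x86 targets the query simply reports false.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

constexpr std::uint32_t kExtendedFeatureLeaf = 0x80000001;
constexpr std::uint32_t kLongModeBit = 1u << 29; // EDX bit 29

// Decodes the Long Mode flag from the EDX value returned by CPUID.
constexpr bool HasLongMode(std::uint32_t edx) {
    return (edx & kLongModeBit) != 0;
}

// Queries the running processor. __get_cpuid returns 0 when the requested
// leaf is not available, in which case Long Mode cannot be present.
inline bool CpuSupportsLongMode() {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(kExtendedFeatureLeaf, &eax, &ebx, &ecx, &edx))
        return false;
    return HasLongMode(edx);
#else
    return false;
#endif
}
```

Unlike the one-line inline assembly, the helper also covers the case where the processor does not implement the extended leaf at all.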

	// Relocate the Kernel
	if (!(MultiBootInformation->Flags & (1 << 3)))
	{
		Print("Error: Kernel Image was not loaded (Multiboot specification, Module 0).");
		for (; ; ) ;
	}
	UInt8 *Dest = (UInt8 *)0x200000;
	UInt8 *Source = (UInt8 *)MultiBootInformation->ModulesTable[0].AddressStart;
	UInt64 Size = MultiBootInformation->ModulesTable[0].AddressEnd - MultiBootInformation->ModulesTable[0].AddressStart + 1;
	if (Dest <= Source)
		while (Size--)
			*Dest++ = *Source++;
	else
	{
		Source += Size - 1;
		Dest += Size - 1;
		while (Size--)
			*Dest-- = *Source--;
	}

This bit of code is a bit more interesting. It first checks bit 3 of the multiboot flags to confirm that a module was loaded, and then relocates the kernel module loaded by GRUB to the expected kernel load address (this is necessary because all absolute pointer references in the kernel code are relative to the base address specified at compile time). The routine extracts the base address and size of the kernel module image from the multiboot information structure and copies the image to the expected load address, copying backwards when the destination lies above the source so that overlapping regions are handled correctly.
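The direction choice is the interesting part, so here it is in isolation. The following sketch is a plain memmove-style helper; CopyOverlapping is a hypothetical name, and the pointer comparison assumes that both pointers refer to the same flat memory region, as they do in the Initialiser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Copies size bytes from source to dest, tolerating overlap between the two.
inline void CopyOverlapping(std::uint8_t *dest, const std::uint8_t *source,
                            std::size_t size) {
    if (dest <= source) {
        // Destination is below the source: copy forwards, low to high.
        while (size--)
            *dest++ = *source++;
    } else {
        // Destination is above the source: copy backwards, high to low,
        // so that bytes are read before they can be overwritten.
        dest += size;
        source += size;
        while (size--)
            *--dest = *--source;
    }
}
```

Copying five bytes two positions upwards within the same buffer, for example, only works when the copy runs from the top down; a naive forward copy would smear the first two bytes over the rest.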

	// Construct the Paging Structures
	// - Calculate number of the structures
	UInt64 MaximumPhysicalAddress = ((UInt64)MultiBootInformation->MemoryLower + (UInt64)MultiBootInformation->MemoryUpper) * 1024;
	MaximumPhysicalAddress -= MaximumPhysicalAddress % 4096;
	if (MaximumPhysicalAddress < ((UInt64)1024 * 1024 * 1024 * 4)) // At least 4 GB needs to be mapped in order to implement the memory mapped I/O
		MaximumPhysicalAddress = (UInt64)1024 * 1024 * 1024 * 4;
	UInt64 NumberOfPDPs = (MaximumPhysicalAddress / 0x8000000000ULL) + ((MaximumPhysicalAddress % 0x8000000000ULL) ? 1 : 0); // Each PDP maps 512 GB (0x8000000000)
	UInt64 NumberOfPDs = (MaximumPhysicalAddress / 0x40000000ULL) + ((MaximumPhysicalAddress % 0x40000000ULL) ? 1 : 0);
	UInt64 NumberOfPTs = (MaximumPhysicalAddress / 0x200000ULL) + ((MaximumPhysicalAddress % 0x200000ULL) ? 1 : 0);
	UInt64 NumberOfPages = (MaximumPhysicalAddress / 0x1000ULL) + ((MaximumPhysicalAddress % 0x1000ULL) ? 1 : 0);
	// - PML4
	UInt64 *PML4Entry = (UInt64 *)0x1000000;
	for (UInt64 i = 0; i < NumberOfPDPs; i++) // Low Canonical Address Area
		PML4Entry[i] = 0x1001000 + (i * 0x1000) + 7;
	for (UInt64 i = 256; i < NumberOfPDPs + 256; i++) // High Canonical Address Area
		PML4Entry[i] = 0x1001000 + ((i - 256) * 0x1000) + 7;
	// - PDPs
	UInt64 *PDPEntry = (UInt64 *)0x1001000;
	for (UInt64 i = 0; i < NumberOfPDs; i++)
		PDPEntry[i] = 0x1001000 + (NumberOfPDPs * 0x1000) + (i * 0x1000) + 7;
	// - PDs
	UInt64 *PDEntry = (UInt64 *)(0x1001000 + (NumberOfPDPs * 0x1000));
	for (UInt64 i = 0; i < NumberOfPTs; i++)
		PDEntry[i] = 0x1001000 + (NumberOfPDPs * 0x1000) + (NumberOfPDs * 0x1000) + (i * 0x1000) + 7;
	// - PTs
	UInt64 *PTEntry = (UInt64 *)(0x1001000 + (NumberOfPDPs * 0x1000) + (NumberOfPDs * 0x1000));
	for (UInt64 i = 0; i < NumberOfPages; i++)
		PTEntry[i] = i * 0x1000 + 7;
	asm(" \
	mov eax, cr4; \
	or eax, (1 << 5); \
	mov cr4, eax; \
	mov eax, 0x1000000; \
	mov cr3, eax; \
	");

This is possibly the most interesting routine in the Initialiser code: paging structure construction. To fully understand this code, you have to read the paging section of the system programming manual from either AMD or Intel; however, for the sake of simplicity, I will briefly explain what the code does. The routine calculates how many tables are required at each level (PDP, PD and PT) to map the entire physical memory space, or the first 4 GB if less memory is installed, and fills each entry with either the address of the next-level table or the physical address of a 4 KB page. Every entry has 7 added to it, which sets the Present, Read/Write and User flags. The tables are laid out contiguously from 0x1000000: the PML4 first, followed by the PDPs, PDs and PTs. Finally, the routine sets the PAE bit (bit 5) in CR4 and loads the address of the PML4 into CR3. Note that the same PDPs are referenced from both the low-canonical and high-canonical halves of the PML4, which are used for kernel and user address space separation (this concept is somewhat similar to the 32-bit Windows kernel memory structure, yet quite different in many ways).
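The table-count arithmetic can be tried out on its own. The sketch below assumes the standard 4-level layout with 4 KB pages, in which one PT maps 2 MB, one PD maps 1 GB and one PDP maps 512 GB; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical container for the table counts; not a type from the Initialiser. */
struct paging_counts {
    uint64_t pdps;   /* page-directory-pointer tables, 512 GB each */
    uint64_t pds;    /* page directories, 1 GB each */
    uint64_t pts;    /* page tables, 2 MB each */
    uint64_t pages;  /* 4 KB pages */
};

/* Round-up division: how many units of `span` bytes are needed to cover `bytes`. */
static uint64_t units_needed(uint64_t bytes, uint64_t span)
{
    return bytes / span + (bytes % span ? 1 : 0);
}

static struct paging_counts count_paging_structures(uint64_t max_physical_address)
{
    struct paging_counts c;
    c.pdps  = units_needed(max_physical_address, UINT64_C(1) << 39);
    c.pds   = units_needed(max_physical_address, UINT64_C(1) << 30);
    c.pts   = units_needed(max_physical_address, UINT64_C(1) << 21);
    c.pages = units_needed(max_physical_address, UINT64_C(1) << 12);
    return c;
}
```

For the 4 GB minimum this gives 1 PDP, 4 PDs, 2048 PTs and 1048576 page entries, so the tables themselves (including the PML4) occupy (1 + 1 + 4 + 2048) x 4 KB, a little over 8 MB starting at 0x1000000.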

After this routine, the process is quite straightforward. Long Mode is enabled by setting the LME bit (bit 8) of the EFER MSR (0xC0000080), paging is switched on by setting the PG bit (bit 31) of CR0, and processor control is transferred to the kernel entry routine:

	// Enter Long Mode
	asm(" \
	mov ecx, 0xC0000080; \
	rdmsr; \
	or eax, (1 << 8); \
	wrmsr; \
	");
	// Enable Paging
	asm(" \
	mov eax, cr0; \
	or eax, (1 << 31); \
	mov cr0, eax; \
	");
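	// Jump to the Kernel
	// - Pass the multiboot information address to the kernel in EBX
	asm("mov ebx, dword ptr [%0];" : : "m"(MultiBootInformation));
	// - 0xEA is a far jump to selector 0x20 (assumed here to be the 64-bit code segment in the GDT);
	//   the bytes at Alpha64Entry encode "mov rax, 0x200000; jmp rax", i.e. a jump to the relocated kernel
	asm(" \
	.byte 0xEA; .int Alpha64Entry; .byte 0x20, 0x00; \
	Alpha64Entry:; \
	.byte 0x48, 0xB8, 0x00, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00; \
	.byte 0xFF, 0xE0; \
	");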

 

This article continues in Operating System IX (Pt. 2, Kernel).

PowerOne Aurora PVI Inverter Communication

PowerOne Aurora PVI series inverters (and other compatible wind-type models) use a proprietary communication protocol for real-time operating data acquisition over a standard RS-485 bus (3-wire D+/D-/GND). Implementing a communication library for devices with such a protocol is a real pain if you do not have a proper protocol specification document (which, in this case, was never made available to the public by PowerOne).

In this article, to reduce the amount of your pain (especially for those who were instructed by their company director to implement one like this in a few days), I will provide a brief explanation and sample code for the communication operation.

[ Packet Structure ]
The Aurora Inverter communication protocol is a fixed-length packet protocol protected by a checksum. A packet is made up of fields such as the address of the target device, the operation type, fixed-length data and a checksum.

The following is the C-style pseudo code definition of the RTU-Inverter (Request) packet format:

struct RTUPacket
{
	byte address;                       // 8-bit Unsigned
	byte operationType;                 // 8-bit Unsigned
	byte data[6];                       // 6-byte Variable-type (depending on operation type)
	ushort crc;                         // 16-bit Unsigned (all non-string data types are little endian)
}

All packets sent from RTU to inverter follow the format specified above. As briefly explained in the comments, the content of the data[6] field varies with the operation type; however, for all RTU packets, it is fairly safe to assume that it is either empty (filled with zeros) or has only data[0] set, to an extended operation type byte (discussed further with the operation types).

The following are the known (and most commonly used) operation types for RTU packets:

enum OperationType
{
    GetState = 50,
    GetProductNumber = 52,
    GetSerialNumber = 63,
    GetDSP = 59,
    GetAccumulatedData = 78
}

As I mentioned before, the data[6] field of the packet varies with the operation type. To be more exact, every operation other than GetDSP and GetAccumulatedData has an empty data[6] field (filled with zeros), whereas for these two operation types data[0] is filled with an extra operation type.

The following are the known extra operation types for GetDSP operation:

enum DspDataType
{
    Riso = 30,
    IleakDcDc = 6,
    IleakInverter = 7,
    TempInverter = 21,
    TempBooster = 22,
    Dc1Voltage = 23,
    Dc1Current = 25,
    Dc2Voltage = 26,
    Dc2Current = 27,
    AcPhase1Voltage = 61,
    AcPhase1Current = 39,
    AcPhase1Frequency = 42,
    AcPhase2Voltage = 62,
    AcPhase2Current = 40,
    AcPhase2Frequency = 43,
    AcPhase3Voltage = 63,
    AcPhase3Current = 41,
    AcPhase3Frequency = 44,
    AcTotalPower = 3
}

 

The following are the known extra operation types for GetAccumulatedData operation:

enum AccumulatedDataType
{
    Daily = 0,
    Weekly = 1,
    Monthly = 2,
    Yearly = 3,
    Total = 5,
    Partial = 6
}

The response of the GetDSP and GetAccumulatedData packets varies with the specified extra operation type value (and they probably are the most valuable operations after all due to that fact). The calculation algorithm of the CRC field will be discussed after the brief overview of the response packet.

The packets sent from inverter to RTU similarly have a fixed size; however, the size is 8 bytes rather than 10 bytes. The following is the definition of the Inverter-RTU (Response) packet format:

struct InverterPacket
{
	byte data[6];                       // 6-byte Variable-type (depending on the requested operation)
	ushort crc;                         // 16-bit Unsigned
}

Note that the packet structure is the same as that of the Request packet minus the first two bytes (yes, the address and operationType fields do not exist in Response packets). What does this imply? Since a Response does not identify its sender or the operation it answers, Request and Response packets have to be strictly serialised: no additional Request packet may be sent until a Response packet is received (or the request times out).

Obviously, the content of the data[6] field in a Response packet varies with the operation type specified in the corresponding Request packet. The following are excerpts from my implementation of the communication class for decoding the response packet:

private bool GetState(byte address, ref byte globalState, ref byte inverterState, ref byte array1State, ref byte array2State, ref byte alarmState)
{
    byte[] buffer = new byte[8];
    if (Communicate(buffer, address, OperationType.GetState) == false)
        return false;
    else
    {
        globalState = buffer[1];
        inverterState = buffer[2];
        array1State = buffer[3];
        array2State = buffer[4];
        alarmState = buffer[5];
        return true;
    }
}

private bool GetProductNumber(byte address, ref string data)
{
    byte[] buffer = new byte[8];
    if (Communicate(buffer, address, OperationType.GetProductNumber) == false)
        return false;
    else
    {
        data = Encoding.ASCII.GetString(buffer, 0, 6);
        return true;
    }
}

private bool GetSerialNumber(byte address, ref string data)
{
    byte[] buffer = new byte[8];
    if (Communicate(buffer, address, OperationType.GetSerialNumber) == false)
        return false;
    else
    {
        data = Encoding.ASCII.GetString(buffer, 0, 6);
        return true;
    }
}

private bool GetDSPData(byte address, DspDataType dspDataType, ref int data)
{
    byte[] buffer = new byte[8];
    if (Communicate(buffer, address, dspDataType) == false)
    {
        data = 0;
        return false;
    }
    else
    {
        byte[] rawData = buffer.Skip(2).Take(4).ToArray();
        MatchEndian(ref rawData);
        data = BitConverter.ToInt32(rawData, 0);
        return true;
    }
}

private bool GetDSPData(byte address, DspDataType dspDataType, ref float data)
{
    byte[] buffer = new byte[8];
    if (Communicate(buffer, address, dspDataType) == false)
    {
        data = 0.0f;
        return false;
    }
    else
    {
        byte[] rawData = buffer.Skip(2).Take(4).ToArray();
        MatchEndian(ref rawData);
        unsafe // NOTE: BitConverter.ToSingle function causes a system-wide failure on Advantech x86 ECs
        {
            float ret;
            byte* retPtr = (byte*)&ret;
            retPtr[0] = rawData[0];
            retPtr[1] = rawData[1];
            retPtr[2] = rawData[2];
            retPtr[3] = rawData[3];
            data = ret;
        }
        return true;
    }
}

private bool GetAccumulatedData(byte address, AccumulatedDataType accumulatedDataType, ref int data)
{
    byte[] buffer = new byte[8];
    if (Communicate(buffer, address, accumulatedDataType) == false)
    {
        data = 0;
        return false;
    }
    else
    {
        byte[] rawData = buffer.Skip(2).Take(4).ToArray();
        MatchEndian(ref rawData);
        data = BitConverter.ToInt32(rawData, 0);
        return true;
    }
}

You may observe from the code above that I defined a separate function for endian-matching (MatchEndian) and even used unsafe code for data type conversion (in C#). This was to ensure that the communication library code is able to run on multiple platforms including, but not limited to, ARM/x86/Itanium/SH4.
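The same platform-independence can be had without any endian-matching step by assembling the value from individual bytes. The following C sketch is an illustration only: the function names are hypothetical, the wire byte order is passed in as a parameter because it is an assumption here rather than something the excerpts spell out, and the float conversion assumes the host uses 32-bit IEEE 754 floats.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assemble a 32-bit value from four wire bytes, independent of the host CPU.
 * wire_is_big_endian is a parameter because the byte order is an assumption here. */
static uint32_t decode_u32(const uint8_t raw[4], bool wire_is_big_endian)
{
    if (wire_is_big_endian)
        return ((uint32_t)raw[0] << 24) | ((uint32_t)raw[1] << 16) |
               ((uint32_t)raw[2] << 8)  |  (uint32_t)raw[3];
    return ((uint32_t)raw[3] << 24) | ((uint32_t)raw[2] << 16) |
           ((uint32_t)raw[1] << 8)  |  (uint32_t)raw[0];
}

/* Reinterpret the decoded bits as a single-precision float (assumes IEEE 754). */
static float decode_f32(const uint8_t raw[4], bool wire_is_big_endian)
{
    uint32_t bits = decode_u32(raw, wire_is_big_endian);
    float value;
    memcpy(&value, &bits, sizeof value);   /* copy the bit pattern; no pointer casts */
    return value;
}
```

In terms of the excerpts above, raw would correspond to the four bytes starting at buffer[2] of a GetDSP or GetAccumulatedData response.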

 

[ Procedures ]
As you may already have observed from the code excerpt of my Get*** methods, all operations begin with a Request packet from RTU to inverter and end with decoding the Response packet received from the inverter. The core of performing this procedure is the Communicate method.

The following is my implementation of the Communicate method:

private bool Communicate(byte[] buffer, byte address, byte param0, byte param1, byte param2, byte param3, byte param4, byte param5, byte param6)
{
    try
    {
        ushort crc;
        bool isSucc = false;
        // Compose the request packet
        byte[] sendBuffer = new byte[10];
        sendBuffer[0] = address;
        sendBuffer[1] = param0;
        sendBuffer[2] = param1;
        sendBuffer[3] = param2;
        sendBuffer[4] = param3;
        sendBuffer[5] = param4;
        sendBuffer[6] = param5;
        sendBuffer[7] = param6;
        crc = Crc16.Compute(sendBuffer, 0, 8);
        sendBuffer[8] = (byte)crc;
        sendBuffer[9] = (byte)(crc >> 8);
        // Attempt to send and receive
        byte[] receiveBuffer = new byte[8];
        for (int i = 0; i < MaxAttempt; i++)
        {
            // Discard all buffers
            _commPort.DiscardOutBuffer();
            _commPort.DiscardInBuffer();
            // Send the request packet
            _commPort.Write(sendBuffer, 0, sendBuffer.Length);
            // Wait for the inverter response
            Thread.Sleep(ReadWaitDelay);
            // Verify and read the received packet
            // - Length
            try
            {
                if (_commPort.Read(receiveBuffer, 0, receiveBuffer.Length) != 8)
                    continue;
            }
            catch { continue; }
            // - CRC
            crc = (ushort)(receiveBuffer[6] | receiveBuffer[7] << 8);
            if (crc != Crc16.Compute(receiveBuffer, 0, 6))
                continue;
            // Copy the received packet into the buffer
            receiveBuffer.CopyTo(buffer, 0);
            isSucc = true;
            break;
        }
        return isSucc;
    }
    catch { return false; }
}

In short, the Communicate method sends a Request packet and expects/reads a Response packet. However, as you can see, there is a bit more to the implementation than simply sending and reading. First of all, every packet read is checked for integrity using the CRC value that it contains. This is an essential process because the error rate on an RS-485 bus can be very high depending on the line characteristics (I once observed a line with a CRC error rate of up to 35% over 10000 trials). Note also that the Read operation has a time-out configured and will fail if no packet of the right length is received in the specified time. It is quite common for sent packets to be lost (no response is received), so a time-out has to be set in order to carry on after a lost packet. In these cases, the Communicate method repeats the same procedure, up to MaxAttempt times in total.

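Now, finally, regarding the CRC calculation: the algorithm is CRC-16 with the 0x8408 polynomial (the bit-reversed form of 0x1021) and an initial value of 0xFFFF, processed least significant bit first, with the result inverted at the end. The following is my C# implementation of the algorithm: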

static class Crc16
{
    public static ushort Compute(byte[] data, int offset, int count)
    {
        ushort polynomial = 0x8408;
        ushort crc = 0xFFFF;
        if (count == 0)
            return 0;
        for (int i = 0; i < count; i++)
        {
            byte current = (byte)data[offset + i];
            for (int j = 0; j < 8; j++, current >>= 1)
            {
                if (((crc ^ current) & 0x1) > 0)
                    crc = (ushort)((crc >> 1) ^ polynomial);
                else
                    crc >>= 1;
            }
        }
        return (ushort)(~crc);
    }
}
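For readers working on a microcontroller rather than .NET, the same checksum and the packet framing described earlier can be sketched in C. This is a hedged illustration, not a port of my library: the function names are made up for this example, and the frame layout simply follows the Request and Response definitions given above (CRC stored low byte first, as in the Communicate method).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-16: reflected polynomial 0x8408, initial value 0xFFFF, inverted result. */
static uint16_t crc16_compute(const uint8_t *data, size_t count)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < count; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0x8408) : (uint16_t)(crc >> 1);
    }
    return (uint16_t)~crc;
}

/* Fill a 10-byte request: address, operation type, six data bytes, CRC (low byte first). */
static void build_request(uint8_t frame[10], uint8_t address, uint8_t operation,
                          const uint8_t data[6])
{
    frame[0] = address;
    frame[1] = operation;
    memcpy(&frame[2], data, 6);
    uint16_t crc = crc16_compute(frame, 8);
    frame[8] = (uint8_t)(crc & 0xFF);
    frame[9] = (uint8_t)(crc >> 8);
}

/* Check the CRC carried in the last two bytes of an 8-byte response. */
static bool response_is_valid(const uint8_t frame[8])
{
    uint16_t received = (uint16_t)(frame[6] | (frame[7] << 8));
    return received == crc16_compute(frame, 6);
}
```

These parameters match the variant commonly catalogued as CRC-16/X-25, whose check value for the ASCII string "123456789" is 0x906E; that makes a convenient self-test before connecting to a real inverter.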

If you have any additional questions regarding the implementation of an Aurora Inverter communication library, please leave a comment (do not send a private e-mail question, as this will not help build the Q&A database on this topic).

Thank you for reading and good luck implementing your own :)

 

p.s. Some models of Aurora Inverters have a tendency to become randomly unresponsive mid-day until the night-mode reset for the next day. If you see one or two (or even more, depending on the situation) inverters becoming unresponsive while the others are all still responsive, that is not your fault. Contact PowerOne regarding this matter and they will perform an on-site communication module upgrade (however, there may be a significant loss of profit depending on the scale of your site, and the business side of your company might go "I ain't 'avin' it").

Electric Bicycle Project

I have recently purchased a Schwinn model electric bicycle from an elderly gentleman who had no use for it and decided to do quite an interesting project with it.

The initial idea of the project was simply to redesign the motor controller unit for the bicycle so that I could get more miles out of it. However, having dreamt of working with electric bicycle technologies for a few years, I knew that I wouldn't be satisfied with that and decided to take this project a lot further and produce an entirely new custom control system.

The current project objectives are as follows:

  • Efficient PWM-based DC motor speed control
  • Regenerative braking utilising a massive (700 F) ultracapacitor bank
  • Energy transfer between ultracapacitor bank and battery bank
  • DC boost (or step-down depending on operation mode) system for battery bank voltage independent speed control and inter-bank energy transfer
  • Intelligent system monitoring and control with MCU

The image above is the simplified system block diagram of the current design concept. At this stage of development, there are many other factors to be considered, and a significant number of changes will eventually be made to this diagram to accommodate the implementation requirements of the design (I apologise for not having a proper scanner device).

At the moment, I am focusing on the physical aspects of the design (e.g. mounting the battery and capacitor banks to the bicycle frame, electrical wiring) and expect to do some cutting and welding work over the next few days. Unfortunately, I do not have any CAD drawings for the mounting frames; I am doing it the good old quick-and-dirty way :( but I will eventually post a picture of the finished product for anyone who may be interested.

On a side note, I decided not to use standard battery bags for various reasons: (1) the additional bank that I am mounting consists of two 20 Ah 12 V batteries weighing up to 14 kg in total, (2) I wanted a more permanent, solid design, and (3) there is no adequate place to mount standard battery bags on this bicycle frame.

Stay tuned for more updates on the project :)

Floppy Drive Organ: Phase 1 Demo and Technical Details

This is a demo video and technical documentation of the Floppy Drive Organ Phase 1 project from 2011.

The Phase 1 system consists of two different control devices:

  • FDMP-MCS: Frequency/interval data sequencer; transmits the note frequency and duration interval over RS-232 to the floppy drive control machine.
  • FDMP: Floppy drive controller; signals the floppy drive through the system parallel port to actuate its stepping motor.

 

[ FDMP (Floppy Drive Music Player) Operation ]

The basic principle of a floppy drive organ is to pulse the floppy drive stepping motor at the frequency of a desired musical note. Because floppy drive stepping motors are noisy, this controlled pulsing produces an audible tone that can be heard as music; quite simple and dirty, isn't it?

So how do we pulse the floppy drive stepping motor?

The following is the pinout of a standard (now obsolete) 3.5" floppy drive.

You may have already noticed from the list above that there is a pin called /STEP (#20). This is the pin to be pulsed at the frequency of the desired note. Also note the [Drive Sel A/B] pins. Before pulsing the /STEP pin, /DRVS[A/B] (depending on your drive configuration) must be set to high logic; otherwise, the motor won't respond at all.

Now that you have a general idea of how this works, let's talk about the details. The actual implementation is, in fact, a bit more complicated than what was described above. The most important part of the implementation is interfacing the floppy drive with the controller.

The most common method for implementing this pulsing mechanism is to use a microcontroller. You can purchase a cheap microcontroller board from eBay and program it to pulse one of its DIO pins at a certain frequency (and of course, the DIO pin must be connected to the /STEP pin of the floppy drive). Although this may be the most viable method for most people, I decided to try something a bit different.

My method of implementation was to use a regular PC parallel port. Since the logic level (0/+5V) of the PC parallel port and the floppy drive interface are compatible, a parallel port can be directly connected to the floppy drive interface for pulsing.

One problem encountered while implementing this method was producing a stable pulse. Due to the time-sharing multitasking of modern operating systems, there was no guarantee that the parallel port output operation would be executed at a perfectly stable rate (required to produce a musical note), resulting in the floppy drive producing a sequence of funky noise rather than musical notes.

After a bit of playing around, it became clear that a non-multitasking operating system needed to be used for controlling the parallel port (and thereby the floppy drive connected to it). In remembrance of the good old days, I decided to use MS-DOS as the development platform.

First of all, to ensure that the pulsing application operates stably, all TSRs were cleared and the IVT (interrupt vector table) was reinitialised to service only essential system/hardware interrupts. The application must not be interrupted while carrying out the parallel port output operation, as a delay of more than tens of microseconds will affect the frequency produced (notice that period = 1 / frequency, where 0 < f < 20000 Hz).
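To put that timing budget into numbers, the period of a note and the pitch error caused by a fixed delay can be computed as below. This is a small sketch with hypothetical helper names; it is not part of the FDMP code, and it models the disturbance as a constant extra delay added to every period, which is a simplification.

```c
#include <stdint.h>

/* Period of one /STEP pulse cycle in microseconds for a note of the given
 * frequency (Hz). Returns 0 for a frequency of 0, treated here as a rest. */
static uint32_t step_period_us(uint32_t frequency_hz)
{
    if (frequency_hz == 0)
        return 0;
    return 1000000u / frequency_hz;
}

/* Relative frequency error, in parts per million, caused by a fixed extra
 * delay (jitter_us) added to every period: error = delay / (period + delay). */
static uint32_t frequency_error_ppm(uint32_t frequency_hz, uint32_t jitter_us)
{
    uint32_t period = step_period_us(frequency_hz);
    if (period == 0)
        return 0;
    return (uint32_t)(((uint64_t)jitter_us * 1000000u) / (period + jitter_us));
}
```

For a 440 Hz note the period is about 2272 microseconds, so an extra 50 microseconds on every cycle already lowers the pitch by roughly 2%, which is why even short interruptions are audible.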

Since I wanted to control the floppy drive from my main PC, I decided to interface the DOS PC running the floppy drive controller application with my Windows PC running the sequencer application over an RS-232 serial port. Essentially, the sequencer application sends the notes to be played (each containing frequency/interval data fields) to the note queue of the floppy drive controller application. This enables a multitasking PC to (indirectly) pulse the floppy drive at a stable frequency.

 

The following is the video of actual operation:

 

I am planning to expand this project to multiple floppy drives (including some 5.25" drives from ancient times) using a microcontroller or a standard PC parallel port coupled with some 7400-series ICs.

Stay tuned for it :)