COM PORT OVERRUN ERRORS:

If your UART is of an older type (8250 or 16450), it can only hold eight bits of data from the modem (one byte). Almost any CPU running MS-DOS can service interrupts quickly enough to empty a one-byte UART without overruns, even if the modem is allowed to fill it at a 115,200 bps rate. However, the extra overhead imposed by Windows generally limits the 8250 interrupt-servicing speed of even the fastest CPU to a rate which can only cope with a modem-to-com-port rate of 9,600 bps (19,200 with only one program window open).

Since modem data compression (V.42bis or MNP-5) can deliver 2 to 4 bytes of compressible data (like text) for each byte actually transmitted over the phone line, a modem receiving data at a nominal 28,800 bps rate can fill a com port's UART at 57,600 to 115,200 bps. The solution to this problem is a UART that can hold up to 16 bytes of data in a First-In/First-Out (FIFO) data buffer: the 16550A.
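
As a rough check of the arithmetic above, the following Python sketch multiplies the nominal line rate by the compression ratio. The function name port_fill_rate is an illustrative assumption, not part of any modem or Windows software.

```python
def port_fill_rate(line_bps, compression_ratio):
    """Rate (in bps) at which decompressed data can reach the com port."""
    return line_bps * compression_ratio

# A nominal 28,800 bps connection with 2:1 and 4:1 compression.
for ratio in (2, 4):
    print(f"{ratio}:1 compression -> {port_fill_rate(28_800, ratio):,} bps")
```

The two results bracket the 57,600 and 115,200 bps port rates discussed in the text.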

The Microsoft Diagnostics program (msd.exe, which can only be run from the DOS prompt after exiting Windows) will tell you whether it detects a 16550 on your com port. Any modern combination disk controller/serial port card can provide one for about US$30. Any internal modem will emulate a 16550 in high-speed firmware. Alternatively, a parallel port modem can connect to your printer port with a software driver that emulates a 16550-based com port at speeds up to 300,000 bps.

In any case, Windows must talk to the UART through a serial port driver (called comm.drv) which can make full use of the 16550A's FIFO. (A dynamically-linked library of functions for an applications program is a .dll-file; one that is loaded by Windows from the windows\system directory at start-up, so it's ready to serve any compatible applications program, is a .drv-file.) Windows uses an entry in the [386enh] section of the system.ini file to enable/disable use of the FIFO - it must say Com#FIFO=1, where # is the number of the com port (1,2, or whatever).

NOTE: Com#FIFO=2 (or any numerical value except 0 or 1) is the default value, which tells Windows to test for a 16550 and, if it's there, to use its FIFO. However, Com#FIFO=1 eliminates any ambiguity about the reliability of such a test with an emulated FIFO. Any non-numeric value, such as "true" or "on", will be interpreted as Com#FIFO=0, which turns off the FIFO. [Microsoft Knowledge Base Article Q119579]
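
The rules in this note can be summarized in a small Python sketch. It is only a model of the behavior described above, not Microsoft's code; the function name fifo_mode and its return strings are assumptions made for illustration, and the case where the line is missing entirely (which the note treats as the default) is not handled.

```python
def fifo_mode(value):
    """Classify the text after 'Com#FIFO=' as the note above describes.

    Returns "off", "on", or "autodetect".
    """
    text = value.strip()
    try:
        number = int(text)
    except ValueError:
        # Non-numeric values such as "true" or "on" are treated as 0.
        return "off"
    if number == 0:
        return "off"
    if number == 1:
        return "on"
    # Any other number: Windows tests for a 16550 and uses the FIFO if found.
    return "autodetect"

for sample in ("0", "1", "2", "on", "true"):
    print(f"Com1FIFO={sample} -> {fifo_mode(sample)}")
```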

The default setting of Com#RxTrigger=8 tells the com port to interrupt Windows to come get data when 8 of the 16 bytes in the FIFO are full. To do so, Windows must remember what it was doing, do a context switch, empty the data, do a context switch back again, and resume what it was doing at the time of the interrupt. The time it takes to come get the data must not exceed the time it will take the modem to fill the remainder of the FIFO with a serial bit stream at 10 bit-times per byte, or there will be an overrun.

At 115,200 bps, one bit-time is less than 9 microseconds, so the modem can serially load the FIFO with a new byte every 87 microseconds. At a 57,600 bps rate, the modem won't load the FIFO any faster than one byte about every 174 microseconds. This is why the first line of defense against overruns is to reduce the rate setting your com port advertises to your modem.
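
The per-byte timing can be worked out with a few lines of Python. This is a minimal sketch that assumes the 10 bit-times per byte used in the text (one start bit, eight data bits, one stop bit); the name byte_time_us is illustrative only.

```python
BITS_PER_BYTE = 10  # start bit + 8 data bits + stop bit, as assumed in the text

def byte_time_us(bps):
    """Microseconds needed to shift one byte into the UART at the given rate."""
    return BITS_PER_BYTE * 1_000_000 / bps

for bps in (9_600, 57_600, 115_200):
    print(f"{bps:>7} bps: {byte_time_us(bps):7.1f} microseconds per byte")
```

Halving the port rate doubles the time between bytes, which is the whole point of lowering the rate setting.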

NOTE: Any well-written communications program sets the com port it uses with the byte format and bit-rate it will need at program start-up. These settings wipe out any previous settings, such as those which Windows makes at Windows start-up with the com port settings in Control Panel. For this reason, the com port rate setting must be specified in the communications program. In the case of Trumpet WinSock, the rate setting is specified in the trumpwsk.ini file. It is written there by the setup screen, or can be edited directly with any ASCII text editor, like Notepad (as opposed to a word processor like Write, which will corrupt the file with formatting characters).

Putting Com#RxTrigger=4 in the [386enh] section of system.ini will allow 120 bit-times for Windows to respond. However, this will force Windows to do a pair of context switches every 40 bit-times, rather than only every 80 bit-times - 50% more time to respond at the price of 100% more overhead. Really severe overrun problems can be addressed by setting Com#RxTrigger=1, but at the very severe cost of using a vast percentage of your machine's resources in context-switching, rather than updating your other program windows.
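
The trade-off between response time and interrupt overhead can be tabulated with a short Python sketch. It follows the text's simplified model (a 16-byte FIFO, 10 bit-times per byte, an overrun as soon as the remaining space fills) and picks 57,600 bps purely as an example rate; the function name response_window is an assumption for illustration.

```python
FIFO_DEPTH = 16      # bytes in the 16550A receive FIFO
BITS_PER_BYTE = 10   # bit-times per byte on the serial line

def response_window(trigger, bps):
    """Return (bit_times, microseconds) left to respond after the interrupt."""
    bit_times = (FIFO_DEPTH - trigger) * BITS_PER_BYTE
    return bit_times, bit_times * 1_000_000 / bps

for trigger in (14, 8, 4, 1):
    bits, usec = response_window(trigger, 57_600)
    interval = trigger * BITS_PER_BYTE  # bit-times between interrupts
    print(f"RxTrigger={trigger:2}: {bits:3} bit-times ({usec:5.0f} us) to respond, "
          f"one interrupt every {interval:3} bit-times")
```

Lower trigger values widen the response window but shorten the interval between interrupts, which is the overhead cost described above.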

Windows 3.1 (and Windows 3.11, a cosmetic upgrade with the same com port capabilities) includes a comm.drv that has limited flexibility in utilizing a FIFO. It has a fixed setting of 14 for RxTrigger, allowing only 20 bit-times for responding to the IRQ before overruns occur. Among the replacements for the Win3.x comm.drv that allow tuning RxTrigger points under Windows 3.x is a "freeware" driver from CyberCom. To use it, place a copy of cybercom.drv in the windows\system directory, and in the [boot] section of system.ini change the line that says "comm.drv=comm.drv" to comm.drv=cybercom.drv, in order to tell Windows which driver to load at start-up.

NOTE: Some FAX program installers replace comm.drv with their own driver, which may or may not be an improvement for WinSock applications. Check the comm.drv= line in the [boot] section of system.ini to see what driver Windows is actually loading from the windows\system directory at start-up. It has been reported that the installer for WinFAX actually disables the FIFO by setting Com#FIFO=0 in the [386enh] section of system.ini - almost guaranteed to cause overruns.
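
The check described in this note can be sketched in Python as a simple line-by-line scan. This is only an illustration: the function name read_setting and the sample text are made up, and a hand-rolled scan is used because system.ini commonly repeats keys such as device=, which stricter INI parsers may reject.

```python
def read_setting(ini_text, section, key):
    """Return the value of key in [section], or None if it is absent.

    Section and key names are compared case-insensitively; lines that
    start with ';' are treated as comments.
    """
    current = None
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().lower()
            continue
        if current == section.lower() and "=" in line:
            name, _, value = line.partition("=")
            if name.strip().lower() == key.lower():
                return value.strip()
    return None

# Hypothetical system.ini contents, for illustration only.
SAMPLE = """\
[boot]
comm.drv=comm.drv

[386enh]
Com2FIFO=0
"""

print("driver loaded:", read_setting(SAMPLE, "boot", "comm.drv"))
print("Com2FIFO:", read_setting(SAMPLE, "386enh", "Com2FIFO"))
```

To inspect a real file, the text of system.ini would be read from disk and passed in place of SAMPLE.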

Windows for Workgroups 3.11 was actually the test-bed for many of the features of Microsoft's Chicago project, which produced Windows95. Consequently, it has many high performance internal structures not found in Win3.1 or Win3.11. It uses a 32-bit virtual device driver (VxD) called VCOMM.386 to virtualize the com ports.

NOTE: A VxD is like a .dll for the operating system. It can talk directly to hardware in ring-0 "privileged" mode, unlike .dlls and application programs, which must operate in ring-3 "protected" mode.

VCOMM.386 talks to another VxD, called serial.386, which actually talks to the FIFO. Application programs still issue their serial communications API function calls to a ring-3 dll called comm.drv. However, this small comm.drv is only there to forward their requests to VCOMM. The whole serial communications architecture of WFW3.11 is quite different from the "everything-in-a-ring3-dll" (comm.drv) serial communications architecture of Windows, and it yields much faster interrupt response performance.

NOTE: The original version of serial.386 contained a bug that caused the issuance of an extra NUL character to the com port when closing a session. This causes some integrated chipsets with a built-in 16550A-type UART cell to hang, requiring a system re-boot in order to open the com port for a second use. The corrected version of serial.386 has a file creation date of 2/17/94.

Neither the Windows comm.drv nor any of its replacements (like cybercom.drv) is designed to talk properly to VCOMM in Windows for Workgroups (unlike the WFW comm.drv). You should never replace the Windows for Workgroups comm.drv with anything. (Since it's only designed to talk to VCOMM, you also can't use the WFW comm.drv in Windows.)


Inadequate FIFO buffering is not the only cause of com overrun errors. Anything that prevents your CPU from responding quickly enough to interrupts from your UART can cause overruns.

One of the more insidious causes is S3-chip-based video cards, which gain high-speed graphics performance by having their software driver steal interrupt cycles from your CPU in the background. This keeps your CPU too busy to respond to com port interrupts in time. The solution is an up-to-date driver, and using its option to turn off this "speed-up" mode.

NOTE: These drivers seem to configure themselves from the [display] section of system.ini at load time. If your board vendor doesn't supply you with a driver installation utility that does so via option check-boxes, manually add a line to the [display] section of system.ini saying bus-throttle=on.

Another cause is poorly written 32-bit disk drivers that aren't WD1003-compatible (needed for Windows' caching software to work properly), and which lock-out lower priority interrupts (like com port interrupts) for an inordinately long time while they dump-to-disk a large write-behind cache. While awaiting longer term fixes by upgrading disk/drivers/BIOS, you can get temporary relief by turning-off write-behind caching.

NOTE: Windows uses a Terminate-and-Stay-Resident (TSR) program for disk-caching called smartdrv which is loaded by your autoexec.bat file. Add the switch /X to turn-off write-behind caching. Windows for Workgroups uses a VxD called VCACHE, ignoring smartdrv except for floppy disk drives. Write-behind caching for VCACHE is turned-off with a line in the [386enh] section of system.ini that says ForceLazyOff=C (or =CD if you have two hard drives) with no spaces and no : after drive letters.

A fully compatible disk driver (like Western Digital's WDCTRL.DRV for its Caviar drives, or Ontrack Software's Drive Rocket) will enable Windows for Workgroups to use both 32-bit file access (with a VxD called VFAT) and 32-bit disk access, which bypasses the DOS disk interrupt services through a DOS Protected Mode Interface (DPMI). This provides much faster disk reads and writes to allow more time for handling com port interrupts.

In general, any upgrade of a peripheral that results in the sudden appearance of com overruns should be debugged by commenting-out the lines in system.ini that load their accompanying new Windows drivers, or the lines in config.sys or autoexec.bat that load their driver TSRs, then rebooting with each one added in succession until the culprit is found. Brain-dead driver software written by people who assume exclusive rights to their customers' machines is an all-too-frequent cause of crippled system interrupt response speed, and com overruns.

This FAQ is available as a Windows Help® file for off-line viewing

Copyright© 1995 by Albert P. Belle Isle