THE ASSEMBLY LANGUAGE "MAGAZINE" VOL 1 NUMBER 4 December 1989
Written by and for assembly language programmers.
Table of Contents

Editorial.......................................2
Policy and Guide Lines..........................3
Beginners' Corner...............................5
Structure, Speed and Size
     By Thomas J. Keller........................7
     Editorial Rebuttal........................11
Accessing the Command Line Arguments
     By Thomas J. Keller.......................13
Original Vector Locator
     by Rick Engle.............................15
How to call DOS from within a TSR
     by David O'Riva...........................22
Environment Variable Processor
     by David O'Riva...........................26
Program Reviews................................35
     Multi-Edit ver 4.00a .....................35
     SHEZ......................................36
     4DOS......................................36
Book Reviews...................................37
     Assembly Language Quick Reference
          Reviewed by George A. Stanislav......37
GPFILT.ASM.....................................39
;Page 1
Editorial
It has been much too long since the last issue of the Magazine was
published. Much of this time was due to the lack of submissions but
there has been enough to assemble since early November. I hope that
it will not be as long till the next one is ready for distribution.
You can help make that possible by writing up and sending in an
article.
I'm trying out a new editor for this issue. That makes it four
editors for 4 issues. There is a review of it in the review section.
There is a continuing and probably insoluble problem in
formatting the 'Magazine'. The readability of the text portions is
enhanced with wider margins and is more easily bound with a wide
left margin. The difficulty arises when source code is included. 80
columns is little enough in which to fit the code and comments,
allowing nothing for margins. So this time we'll try a 5 space
margin on the left for the text portion. Further offset should
be done with your printer.
A couple of quick notes here as I don't know where else to put
them.
For the assembly programmer the principal difference in writing
for DOS4+ is that there is a possible disk structure using 32 bit
sector numbers (partitions larger than 32 MB); the FAT entries
themselves are still 12 or 16 bits. This of course has no effect as
long as you use only the DOS file calls for disk access, but if you
are going to do direct disk editing this must be checked for.
The occasional ~ is for the use of my spelling checker.
;Page 2
Policy and Guide Lines
The Assembly Language 'Magazine' is edited by Patrick and David
O'Riva. We also operate the AsmLang and CFS BBS to distribute the
'Magazine' and to make available as much information as possible to
the assembly language programmer. On FidoNet the address is
1:143/37. Address:
2726 Hostetter Rd
San Jose, CA 95132
408-259-2223
Most Shareware mentioned is available on the AsmLang board if local
sources cannot be found.
Name and address must be included with all articles and files.
Executable file size and percent of assembly code (when available)
should be included when a program is mentioned and is required from
an author or publisher. Any article of interest to Assembly
language programmers will be considered for inclusion. Quality of
writing will not be a factor, but I reserve the right to try and
correct spelling errors and minor mistakes in grammar, and to remove
sections.
Non-exclusive copyright must be given. No monetary
compensation will be made.
Outlines of projects that might be undertaken jointly are
welcome. For example: One person who is capable with hardware
needs support from a user friendly programmer and a math whiz.
Advertisements as such are not acceptable. Authors and
publishers wishing to contribute reviews of their own products will
be considered and included as space and time permit. These must
include executable file size, percent of assembly code and time
comparisons.
Your editor would like information on math libraries, and
reviews of such.
Articles must be submitted in pclone readable format or sent
E-mail.
Money: Your editor has none. Therefore no compensation can be
made for articles included. Subscription fees obviously don't
exist. Publication costs I expect to be nil (NUL). Small
contributions will be accepted to support the BBS where back issues
are available as well as files and programs mentioned in articles (if
PD or Shareware ONLY).
Shareware-- Many of the programs mentioned in the "Magazine"
are Shareware. Most of the readers are prospective authors of
programs that can be successfully marketed as Shareware. If you
make significant use of these programs the author is entitled to his
registration fee or donation. Please help Shareware to continue to
;Page 3
be a viable marketing method for all of us by urging everyone to
register and by helping to distribute quality programs.
;Page 4
Beginners' Corner
I finished up the last column by saying I would discuss more
techniques this time. I have entirely forgotten what they were. So
without dwelling on that we will just move on to the means of getting
your program ready to run. The two formats (.com and .exe) are very
different and so will be discussed separately.
COM Programs
On Entry all of your segment registers are set to the same
value, that of the start of the PSP. Your stack pointer is set to
the top of the segment, and your instruction pointer is set to 100h.
You need to make a generous estimate of the maximum amount of stack
that your program can use (or count it exactly). Each level of Call
uses 2 bytes (for the address of the next instruction). An INT uses
6 bytes. (2 for the IP, 2 for the CS, and 2 for the Flags). Each push
of course uses 2. So if your subroutines can go 4 levels deep and
contain 7 pushes (without intervening pops) and the deepest contains
an INT21h, then you would need at least 28 bytes of stack. But
stack space is cheap, and you might need to change things. So use a
nice round number of 128 bytes. BIOS also uses YOUR stack in the
earlier versions of DOS, and the guideline for that is at least 128
bytes. Result: 256 bytes is safe for a modest program. To implement
this the following lines of code could be used at the start of the
program: ~
        org     100h
        jmp     main
defstack db     32 dup('stack   ')
stacktop label  byte
;other data
main:
        cli
        mov     sp,offset stacktop
        sti
~
The db statement is 32 times the string of 8 characters ('stack'
plus three blanks), totaling 256 bytes. It could just as well be
db 256 dup(?), but it is
kind of nice when looking at it with a debugger to see the stack
area and how much has been used all nicely labeled. The cli and
sti aren't really necessary here because it is only one instruction,
but you are dealing with the stack, and it's well to remember that.
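The arithmetic of the stack estimate can be left to the assembler. What follows is only a sketch, assuming MASM-style EQU, IF and .ERR directives. All of the names are made up for illustration, and the figures are the ones from the example above.

```asm
; Hypothetical names; figures taken from the example in the text.
CALL_LEVELS equ 4               ; nested CALLs, 2 bytes each
PUSH_COUNT  equ 7               ; pushes outstanding, 2 bytes each
INT_FRAME   equ 6               ; IP + CS + Flags for one INT
STACK_NEED  equ CALL_LEVELS*2 + PUSH_COUNT*2 + INT_FRAME   ; 8+14+6 = 28
STACK_SIZE  equ 256             ; 128 for the program, 128 for BIOS

        if STACK_NEED gt STACK_SIZE - 128
        .err                    ; stop the assembly if our half is too small
        endif
```

If the program grows, only the first two equates need to change, and the assembler complains before the stack does.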
At the end of your program you need a label e.g.
~
progend label   byte
Then following your stack adjustment above:
        mov     bx,offset progend
        mov     cl,4
        shr     bx,cl
        inc     bx
~
;Page 5
These instructions convert the offset value into a number of
paragraphs (16 bytes each), rounded up to the end of the last
paragraph. This is the total number of paragraphs that will be
occupied by your program. Then it is necessary to inform DOS of
this information:
~
mov ah,4ah
int 21h
~
4Ah is the DOS function to modify allocated memory. It needs the
new number of paragraphs in BX (which is where it was put) and the
segment of the block in ES, which in a .COM program still points to
the PSP.
At this point, your program is in an orderly condition. Your
data as well as that in the PSP is available with the DS and ES
registers, the stack is large enough and well mannered, and all
surplus memory is available to you or other programs.
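Putting the pieces of this column together, a complete .COM skeleton might look like the sketch below. MASM-style syntax is assumed. The segment name cseg, the labels start and fail, the EVEN directive (to keep the stack pointer on a word boundary) and the carry check after the DOS call are additions for illustration and are not part of the fragments above.

```asm
cseg    segment
        assume  cs:cseg,ds:cseg,es:cseg,ss:cseg
        org     100h
start:  jmp     main

        even                            ; assumption: word-align the stack
defstack db     32 dup('stack   ')      ; 32 x 8 = 256 bytes of stack
stacktop label  byte
;other data

main:
        cli
        mov     sp,offset stacktop      ; move the stack inside the program
        sti

        mov     bx,offset progend       ; size of the program in bytes
        mov     cl,4
        shr     bx,cl                   ; bytes / 16 = paragraphs
        inc     bx                      ; round up to the next paragraph
        mov     ah,4ah                  ; modify allocated memory, ES = PSP
        int     21h
        jc      fail                    ; carry set means DOS refused

        ;the real work of the program goes here

        mov     ax,4c00h                ; terminate, return code 0
        int     21h
fail:
        mov     ax,4c01h                ; terminate, return code 1
        int     21h

progend label   byte                    ; must follow all code and data
cseg    ends
        end     start
```

The stack has to be moved before the memory block is shrunk, because the original stack at the top of the segment lies in the memory that is being handed back to DOS.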
;Page 6
Structure, Speed and Size
as
Elements of Programming Style
By Thomas J. Keller
P.O. Box 14069
Santa Rosa, CA, 95402
Let us examine the reasons for choosing to implement a given
program in assembly language as opposed to some high level language.
The reasons most commonly given are execution speed and memory image
size.
Execution speed, except in certain highly critical realtime
applications, or certain high resolution graphics applications, is
probably not a realistic reason to opt for assembly language. For
example, a good C compiler with optimization (which precludes use of
Turbo or Quick C) produces code which suffers only a 10-15% speed
penalty over typical hand crafted assembly language code. It is
possible to write assembly language code which will run faster than
this, but few programmers have the requisite skills.
In most applications, a 10-15% speed penalty is simply
irrelevant. It is unlikely that the typical user would even notice
such a difference. In particular, programs which are highly
interactive, and thus spend far and away the greatest amount of time
waiting for user input are highly insensitive to such speed
penalties. Many people don't realize that even assuming that the
user is typing at a rate of 100 wpm (approximately 500
keystrokes/minute), the CPU is still spending the bulk of its time
idling, waiting for the next keystroke.
There are, of course, always exceptions to virtually any rule,
and there are most certainly exceptions to this rule. Word
processors, for example, while actually accepting text input, are
not speed critical. When performing global search and replace, or
spell checking, however, even a 10% penalty can become expensive
on large documents. So there is a tradeoff to be made.
Assembly language programs cost considerably more than 10-15%
more to develop than high level programs. The minutiae involved in
managing a massive assembly language programming effort are
overwhelming. Assembly language programs take MUCH longer to
complete, in almost all cases, than high level programs do, a major
contributory factor in the overall cost of development. Finally,
projects developed in high level programming languages are much more
likely to be easily ported to platforms based on processors other
than the platform on which the project is developed, a very
important consideration for a major project. The ability to port a
project easily to other platforms increases the market for a
product, thereby not only increasing the profitability of the
product, but also helping to reduce the sale price of the product
(larger market generally translates to lower per unit cost).
;Page 7
So the vendor or developer must analyze the relative impact of a
small improvement in execution speed vs a large increase in
development time and cost, which consequently translates to higher
selling prices, thereby reducing the anticipated market for their
product. In many cases, the tradeoffs do not merit choosing
assembly language.
Let us turn now to binary image size (memory size). The
advantages of small programs are clear, when examining programs
which are, in the DOS world, TSRs (The MAC and AMIGA worlds have
similar cases, though I am not sufficiently familiar with them to
know what they are called). These programs are loaded into memory,
and remain there until explicitly removed, which means that the
memory they use is NOT available for other uses. Device drivers
similarly use memory, precluding its use for other programs, and
therefore also clearly benefit from small size. In the
multi-tasking world (DESQview or PC/MOS, in the PC clone market),
small executables also have an advantage, permitting more programs
to be run "simultaneously" in a given memory configuration, though
running multi-taskers in severely restricted memory configurations
probably qualifies as a technical error.
What of normal, single tasking, single user environments (such as
DOS, the MAC and AMIGA environments)? Besides the ego boost of
creating a very small, very tight utility or application, what benefit
is there in generating very small programs?
They take less disk space to store, but realistically, at least
under DOS, lots of very small utilities may actually not achieve a
significant savings in disk space, due to granularity of storage
allocation. They load a little faster, in most cases.
But once again, the economics of the issue come back to haunt us.
It is not clear that the effort and expense of writing most
applications in assembly language due to size considerations is an
economically rational decision. The same economic pressures and
considerations apply as do to the execution speed issue discussed
above.
On to structure. I must take issue with Patrick O'Riva regarding
the purpose and nature of "structured programming." While much of
his definition is true, it is incomplete, and appears to reflect a
misunderstanding of certain aspects of the structured approach to
programming.
Firstly, it is entirely possible (and not altogether a rare
occurrence) to write thoroughly unstructured code in PASCAL or C. One
must take care to recognize the difference between references to a
"block structured" language, as PASCAL and C both are, and "structured
programming," which is really a totally separate issue.
Structured programming is an approach to programming that is
thoroughly applicable to whatever language a project is being
implemented in. It implies firstly a step-wise refinement approach
to defining the solution to a problem which the program is to address
(in other words, determining the nature of the desired goal, and an
at least rational approach to reaching said goal). Secondly, it
;Page 8
involves determining, to the extent possible, the nature and structure
of the data that is to be processed by the program. Finally, it
involves a top-down approach to the actual coding process.
Just what is a top-down approach? Essentially, this means that we
code the high level functionality of the program first, programming
simple "do nothing" stubs for the lower levels of the program. As
necessary to test the high level code, we implement lower level
functions, again, if needed, programming still lower level stubs.
Assuming that the structured design approach of step-wise refinement
was used to begin with, the actual coding should really amount to
translating the logic flow diagrams, or pseudo-code, or whatever means
of recording the refinement process was used, into actual program
code. In the ideal situation, the program almost literally codes
itself at this point.
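In assembly language the first pass of a top-down design might look like the fragment below. It is only a sketch: the routine names are hypothetical, MASM-style syntax is assumed, and the fragment is meant to sit inside a code segment such as a .COM program after its org 100h.

```asm
; Hypothetical first pass: the high level flow is real code,
; everything beneath it is a "do nothing" stub.
main:
        call    get_input       ; written and tested first
        call    process
        call    put_output
        mov     ax,4c00h        ; terminate, return code 0
        int     21h

get_input proc  near            ; stub, to be filled in later
        ret
get_input endp

process proc    near            ; stub
        ret
process endp

put_output proc near            ; stub
        ret
put_output endp
```

Each stub is replaced by working code only when the level above it needs it for testing, and may in turn call stubs of its own.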
There is a myth that "structured programming" means "goto-less"
programming. In fact, this is not the case. This myth came into
being through misunderstanding of the rather harsh criticism of the
"go to" which occurred in the computer science journals beginning in
approximately the mid to late sixties. This criticism was based
primarily upon the typically excessive use of the "go to" in FORTRAN
and BASIC programming at the time. Such indiscriminate use of "goto"
led to what has been called "spaghetti" code, code which is virtually
impossible to trace or analyze.
In fact, there are many cases in programming where the goto is the most
structured solution available. Structured coding techniques are
intended to clarify and make easier the process of analysis, design
and implementation of computer programs, not to define rigid, strictly
enforced rules in the face of all reason.
Structured programming is ALWAYS the best approach to ANY computer
program. If the internal requirements of the program, as regards
speed or memory utilization, dictate the use of goto's, then use them.
A properly documented GOTO can be far more "structured" than an
undocumented string of modular function calls.
So, back to assembly language programming. When is it appropriate
to choose assembly language to implement a program? First, and most
obviously, when the speed or memory utilization requirements of the
application demand the capabilities that well crafted assembly
language offers. Second, perhaps not so obviously, when it is
necessary to work at the hardware level a great deal. High level
languages, even C, do not generally manipulate hardware registers
efficiently. So, if your program makes frequent or widespread use of
direct hardware manipulation, it is a likely candidate for assembly
language.
Finally, and probably the most gratifying reason of all to choose
assembly language, is when you want the satisfaction of having tackled
a project in assembly and pushed the bits around to suit your purpose.
There is little I can imagine that is more satisfying than to reach
down into the microprocessor chip and twiddle those bits. Just be
sure that you don't let your ego cloud your judgment, when the
economics of the project are important (e.g., when a project is to be
distributed commercially, or there is an urgent need for speedy
;Page 9
completion).
I believe that all PROGRAMMERS (as opposed to casual computer
users) should learn the assembly language for the machines on which
they work. Besides offering the flexibility of shifting to assembly
to meet a specific goal, learning assembly intimately familiarizes the
programmer with the hardware on which s/he is working. The more you
know about your hardware environment, the better off you are.
;Page 10
Editorial Rebuttal
I thank Mr. Keller very much for his article and agree with
many of the points he has made. However I must still argue the
points of size and speed and justification.
Whenever a program is user limited and will not be used in a
multi-tasking environment as is often the case with a word processor
and certain drawing programs, there may be little to be gained in
assembly programming. Also there are programs which are DOS limited
and little speed increase is possible.
Mr. Keller uses a figure of 10 to 15 percent speed penalty. My
experience indicates a value closer to 300 to 400 percent though
direct comparisons are difficult to make because the same programs
are usually not written in both assembly and in C. The size
difference seems to be a factor of 5 to 10. The two prime examples
I can offer are both by Microsoft, and it can be assumed they make
use of an optimizing compiler. Their assembler is approximately 110k
in size. A86 while not compatible in syntax has comparable features.
Its size is 22k and it assembles code in about one eighth the time.
Microsoft's programmers' editor is roughly 250k. Qedit is
about 50k and is a mix of high level and assembly. You can grow gray
hairs waiting for the MS editor to do a search and replace, but if
you blink you'll miss it with Qedit. A fully capable full screen
editor without the extras that make it a pleasure to use can easily
be written in less than 5k. Give another 5k for features. What has
MS gained with the extra 240k of code?
David has recently completed (though they are still adding
modules) a database and accounting program for a multi-office
company. A much abbreviated version was threatening to overflow their
384k limit. Investigation of a Dbase implementation indicated in
excess of 500k. Data base sorts used to take 10 hours. They now take
20 minutes. Savings in processing time and entry time plus increased
functionality suggest a savings of $5000 to $10,000 per month PER
OFFICE. Code size? 35k. Are they unhappy about the $15,000 they've
been charged for a program that will get lost in a single floppy
disk?
Given the above examples, I must maintain that the use of high
level language, when there is significant processing to be done, and
when it will be used on a regular and continuing basis, benefits
only the software corporation, and is detrimental to the end user.
On Structured Programming I fully agree with Mr. Keller and
hope that he clarified any misconceptions I left you with. I prefer
a bottom up construction, but that is only preference and has no
effect on the end product.
Dave's notes: Mr. Keller mentions that it is possible to get
great size/speed reductions, but that few programmers have the requisite
skills. But to a large extent, it isn't the skill that makes the
program, it's the toolbox. The C language is extremely close to
assembly - MSC does a very good job of optimizing - and it takes care of
the minutiae for you. The problem with this is that the libraries
;Page 11
supplied with the compilers were written to handle very general cases.
The printf() function is an extreme example, but it typifies the
problem: If you use printf once in your program to print "Hello", it
adds 30K of code!
Another concern is that many high-level-language programmers
don't even realize that with a tweak here, using putc instead
of printf there, they can get much(!) better performance from their
programs. Familiarity with the quirks of the compiler being used is a
necessity... And even that isn't enough to get good performance out of a
large program. AND, it decreases portability. So you're right back
into the twiddling usually associated only with assembly.
I've found that if I use C for anything except flow control and
one-shot tools, my programs start to get huge and slow, relative to
anything that I've banged out in assembly. The database is a great
example - it's a very complicated application, with a completely
separated data engine & OS interface. If it had been written in C, it
would be working in multiple code segments on a 286 with 4 megs and
STILL take hours to run a balance, instead of 35K of code on an XT
network terminal with half-hour runs.
The database was indeed a massive effort, but at this point it
would be possible to strip out the engine and write with ease (and
macros - lots of macros) anything that could be done in C or Dbase, and
do it much better. And average runtime is cut at least in half, size by
50-90%. With a reasonably solid and application-specific toolbox, the
advantages TO THE CUSTOMER of assembly programming completely eclipse
those of any other language and the disadvantages of assembly itself.
Portability is another issue entirely. If you NEED
portability and fast development, and IF run time and general
productivity are not a concern, then C probably makes more sense.
There's this nagging feeling, though, that if the UNIX OS core had been
written in assembly by a reasonably good programmer, and been ported to
new systems in kind, that the university systems would be clipping
instead of slogging.
As far as structured programming goes, I usually design as I go
along, and end up with a functional (even rational) structure. Call it
"random-access programming." This is probably because I find it
difficult to call a routine until I've laid out the calling conventions
for it, and while I'm doing that I'll remember another routine that
should be written for another module... This is not the generally
recommended method, I gather.
;Page 12
Accessing the Command Line Arguments
in Assembly Language Programs
By Thomas J. Keller
P.O. Box 14069
Santa Rosa, CA, 95402
If you're like me, you program in several languages, under
several different operating systems. Under DOS, one very useful
feature is the capability to pass arguments to a program as part of
the invocation command line. The use of command line arguments
significantly increases the power and flexibility of your programs,
as well as improving the "professional look." Many languages
support this capability with intrinsic or library routines which
facilitate access to these command line arguments. Assembly
language, of course, does not. What is a programmer to do?
As it turns out, it is quite simple to access the command line
arguments under DOS. DOS places the so-called "command tail" (the
command line less the actual program name) into a buffer area
reserved in the PSP (Program Segment Prefix). This buffer area is
also the default DTA (Disk Transfer Area).
It is extremely important that you parse the command tail, if you
plan to do so at all, immediately upon entering your program. DOS
does some particularly obscure and insidious things with this DTA
buffer, which will destroy the command tail information.
In a .COM format program, the PSP is the first 100h (256) bytes
of the program memory image, making access quite straightforward.
How do we locate the PSP in a .EXE format program, however?
Fortunately, DOS sets the ES segment register to point to the
beginning of the PSP under both .COM and .EXE programs. It happens
to be the case that DOS also sets all other segment registers to the
same location for a .COM program, simply because .COM programs
reside in one and only one segment. In an .EXE invocation, the DS
and ES registers are set to point to the segment in which the PSP
resides as the first 100h bytes. This is the default data segment
as well.
The DTA begins at offset 80h (128d) from the beginning of the
PSP. When it contains a command tail, the byte at 80h contains the
count of the number of bytes actually in the command tail, and the
command tail string begins at offset 81h (129d) from the beginning
of the PSP. The first byte of this string is always a blank (20h),
and the string is terminated with a carriage return (0dh), which is
not included in the count.
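The layout just described can be read with a few instructions. The fragment below is a sketch only, in MASM-style syntax. It assumes ES still holds the PSP segment from program entry, and the label tail_done is made up for illustration.

```asm
; Fragment only. Assumes ES = PSP segment, as it is on entry to
; both .COM and .EXE programs.
        mov     cl,byte ptr es:[80h]    ; count of bytes in the command tail
        xor     ch,ch                   ; CX = count (0 if no arguments)
        mov     di,81h                  ; ES:DI -> first byte of the tail
        mov     al,' '
        cld
        jcxz    tail_done               ; empty tail, nothing to scan
        repe    scasb                   ; skip over leading blanks
        je      tail_done               ; still equal: tail was all blanks
        dec     di                      ; ES:DI -> first non-blank character
        inc     cx                      ; CX = bytes left, counting that one
tail_done:
        ; CX = 0 means no arguments were given
```

Because the byte at 80h is used as the limit, the scan never depends on finding the terminating carriage return.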
The exact means you use to parse the command line arguments is,
of course, up to you. One possible approach is as follows:
1) Use the data definition directives to set aside any memory you
will need to store information about command line arguments
(e.g., buffers for file names, byte or word values for flags and
numeric arguments, etc.).
2) Design a routine that starts scanning the command tail string for
;Page 13
arguments. A 'first fit' (the shortest match possible) scheme is
easiest to program. As each item is located and identified as to
type and purpose, store the appropriate information in the data
areas you have already set aside.
3) Have a "usage" message defined, and a small routine to print it
to the screen (a good idea is to print it to STDERR). Invoke
this routine when the first argument on the command line is a
'?,' or, if the program requires arguments, when it is invoked
without them.
4) You now have the switches, filenames, and other command line
arguments available. Write your program to use them
appropriately.
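Step 3 of the outline above could be sketched as the small .COM program below. This is an illustration under stated assumptions, not the method used in GPFILT.ASM: the message text, the labels and the return codes are all made up, and MASM-style syntax is assumed. The data is placed ahead of the code so that usage_len is known when it is used.

```asm
cseg    segment
        assume  cs:cseg,ds:cseg
        org     100h
start:  jmp     main

usage_msg db    'usage: PROG [?] arguments',13,10   ; hypothetical text
usage_len equ   $ - usage_msg

main:   cld
        mov     si,80h
        lodsb                       ; AL = length of the command tail
        xor     ah,ah
        mov     cx,ax
        jcxz    usage               ; invoked with no arguments at all
skip:   lodsb                       ; fetch the next character of the tail
        cmp     al,' '
        jne     first
        loop    skip
        jmp     short usage         ; the tail was nothing but blanks
first:  cmp     al,'?'              ; first argument starts with '?'
        je      usage

        ;normal argument processing would go here

        mov     ax,4c00h            ; terminate, return code 0
        int     21h

usage:  mov     dx,offset usage_msg
        mov     cx,usage_len
        mov     bx,2                ; handle 2 = STDERR
        mov     ah,40h              ; DOS write to file or device
        int     21h
        mov     ax,4c01h            ; terminate, return code 1
        int     21h
cseg    ends
        end     start
```

Writing through handle 2 keeps the usage message on the screen even when the program's normal output has been redirected to a file.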
Included in this issue of Assembly Language Magazine is a source
listing, GPFILT.ASM, which is a sample template for a general purpose
assembly language filter. This program provides an excellent sample
of command line argument parsing and one way of using these
arguments (though the method used here is not the same as the one
described above).
;Page 14
Original Vector Locator
by Rick Engle
November, 1989
INTTEST is a small assembly program which attempts to find the
original address of the INT 21h function handler. This is valuable
if you need to be able to make calls to the original INT 21h
function even if a TSR or other program has that interrupt hooked or
trapped. This gives your program secure control over the interrupt
regardless of who is using it.
I did this prototype in an attempt to make certain programs
somewhat immune to the effects of destructive viruses that may
intercept INT 21h and use it for their own purposes. This technique
could be used to find the original address of other MS-DOS
interrupts. I wrote test programs to dump out the address of MS-DOS
interrupts (such as INT 21h) and then disassembled portions of
MS-DOS at those addresses to identify a stable signature of the
interrupt. Then by following the chain to MS-DOS through the PSP
(Program Segment Prefix) at offset 5h, I was able to find the
segment:offset of the address of the handler for old CP/M calls.
This pointed to the correct segment in memory of MS-DOS and
from there, after moving my offset backwards about 100h in memory, I
scanned for my interrupt signature. Once I got a hit, I calculated
the address of the interrupt and then could make calls to INT 21h at
the segment:offset found. This program is a "brute-force" method of
finding the original address. If anyone finds or has a better way,
I'd be very interested in hearing about it.
NOTE: I have tested this program successfully on MS-DOS
2.11, 3.20, and 3.30.
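Once the original address has been stored as a far pointer (offset in the low word, segment in the high word, as OLDINT21 is in the listing), a DOS call can be made through it without executing an INT instruction. The fragment below is a sketch of the general technique only, and is an assumption about how such a call can be made rather than a copy of the author's routine. The name hello_msg is hypothetical and DS is assumed to address both it and OLDINT21.

```asm
; Sketch: imitate "int 21h" through a saved far pointer.
        mov     dx,offset hello_msg     ; hypothetical '$' terminated string
        mov     ah,9                    ; DOS print string
        pushf                           ; INT pushes the flags first ...
        cli                             ; ... and disables interrupts ...
        call    dword ptr [OLDINT21]    ; ... then pushes CS and IP
        ; the handler's IRET pops IP, CS and the flags pushed above
```

The PUSHF is what makes the far CALL look like an interrupt to the handler, so that its IRET leaves the stack balanced.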
~
; -----------------------------------------------------------------------
; INTTEST.ASM November, 1989 Rick Engle
;
; Finds the address of the INT 21h function dispatcher to
; allow the user to make INT 21h calls to the original
; interrupt regardless of who or what has INT 21h hooked.
;
; -----------------------------------------------------------------------
;
print macro print_parm
push ax
push dx
mov ah,9
mov dx,offset print_parm
int 21h
pop dx
pop ax
endm
; -----------------------------------------------------------------------
; - Start of program -
; -----------------------------------------------------------------------
;Page 15
cseg segment para public 'code'
assume cs:cseg,ds:cseg
org 100h
int_test proc far
print reboot_first
print int_address
mov cl,21h
mov ah,35h ; get interrupt vector
mov al,cl ; for interrupt in cl
int 21h ; do it
mov ax,es ; lets display the es
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print colon
mov ax,bx ; lets display the bx
mov di,offset out_byte
call conv_word
print out_byte
print crlf
print display_header2
mov ah,byte ptr cs:[05h] ; Get info from the PSP
mov al,byte ptr cs:[06h] ;
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print dash
mov ah,byte ptr cs:[07h] ;
mov al,byte ptr cs:[08h] ;
mov di,offset out_byte
call conv_word
print out_byte
print dash
mov ah,byte ptr cs:[09h] ;
mov al,byte ptr cs:[0ah] ;
mov di,offset out_byte
call conv_word
print out_byte
print crlf
print display_header
mov ah,byte ptr cs:[50h] ; Address of INT 21h op code
mov al,byte ptr cs:[51h] ; in the PSP
;Page 16
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print dash
mov ah,byte ptr cs:[52h] ;
mov al,byte ptr cs:[53h] ;
mov di,offset out_byte
call conv_word
print out_byte
print dash
mov ah,byte ptr cs:[54h] ;
mov al,byte ptr cs:[55h] ;
mov di,offset out_byte
call conv_word
print out_byte
print crlf
print far_address
mov ax,word ptr cs:[08h] ;
mov segm,ax
push cs ; set es = cs
pop es ;
mov di,offset out_byte
call conv_word
print out_byte
print colon
mov ax,word ptr cs:[06h] ;
mov off,ax
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print crlf
mov ax,segm
mov es,ax
mov di,off
inc di
print function_jmp
mov ax,word ptr es:[di+2] ;
mov segm2,ax
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print colon
mov ax,segm
mov es,ax
mov di,off
;Page 17
inc di
mov ax,word ptr es:[di] ;
mov off,ax ; save found offset of int 21h
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print crlf
;-----------------------------------------------------------------
;si = string di = string size es:bx = pointer to buffer to search
;ax = number of bytes in buffer to search. Zero flag set if found
;-----------------------------------------------------------------
mov ax,segm2
mov es,ax ;segment
mov bx,off ;offset
sub bx,0100h ;backup a bit to catch DOS
mov si,offset dos_sig ;start at modified byte
mov di,dos_sig_len ;enough of a match
mov ax,0300h ;# of bytes to search
call search ;use our search
jnz sig_not_found ;didn't find int 21h signature
mov START_SEGMENT,es ;set page
mov START_OFFSET,ax ;address of found string
print good_address
mov ax,START_SEGMENT ;
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print colon
mov ax,START_OFFSET ;
mov off,ax ; save found offset of int 21h
push cs ; set es = cs so that
pop es ; the stosb works
mov di,offset out_byte
call conv_word
print out_byte
print crlf
push cs ; set es = cs
pop es
mov bx,START_OFFSET
mov ax,START_SEGMENT
mov word ptr [OLDINT21], bx
mov word ptr [OLDINT21+2],ax
mov dx,offset test_message
mov ah,9
call dos_function
jmp terminate
sig_not_found:
;Page 18
print no_int21_found
terminate: mov ax,4c00h ; terminate process
int 21h ; and return to DOS
out_byte db 'XXXX'
db '$'
colon db ':$'
dash db '-$'
crlf db 10,13,'$'
reboot_first db 13,10,'INTTEST 1.0',13,10
db 'Reboot before running this, or',13,10
db 'make sure INT 21h is not hooked',13,10,13,10,'$'
display_header db 'HEX data at PSP address 50h is : $'
display_header2 db 'HEX data at PSP address 05h is : $'
int_address db 'Original INT 21h address is : $'
function_jmp db 'Jump address at DOS dispatcher : $'
far_address db 'Far address of DOS dispatcher : $'
good_address db 'Good INT 21h address found at : $'
test_message db 13,10,10,'This message is being printed using the INT '
db '21h Interrupt',13,10
db 'Found by Brute Force!!!!',13,10,10,'$'
no_int21_found db 13,10,'Int 21h address not found!$'
segm dw 0
segm2 dw 0
off dw 0
START_OFFSET dw 0 ;top addr shown on screen
START_SEGMENT dw 0
;dos_sig db 08Ah, 0E1h, 0EBh ; mov ah,cl
; ; jmp short label
dos_sig db 080h, 0FCh, 0F8h ; cmp ah,0F8h
dos_sig_len equ $ - dos_sig
OLDINT21 dd ? ; Old DOS function interrupt vector
int_test endp
; -----------------------------------------------------------------------
; - -
; - Subroutine to convert a word or byte to hex ASCII -
; - -
; - call with AX = binary value -
; - DI = address to store string -
; - -
; -----------------------------------------------------------------------
conv_word proc near
push ax
mov al,ah
call conv_byte ; convert upper byte
pop ax
call conv_byte ; convert lower byte
ret ; and return
conv_word endp
conv_byte proc near
push cx ; save cx
;Page 19
sub ah,ah ; clear upper byte
mov cl,16
div cl ; divide binary data by 16
call conv_ascii ; the quotient becomes the
stosb ; ASCII character
mov al,ah
call conv_ascii ; the remainder becomes the
stosb ; second ASCII character
pop cx ; restore cx
ret
conv_byte endp
conv_ascii proc near ; convert value 0-0Fh in al
add al,'0' ; into a "hex ascii" character
cmp al,'9'
jle conv_ascii_2 ; jump if in range 0-9
add al,'A'-'9'-1 ; offset it to range A-F
conv_ascii_2: ret ; return ASCII character in al
conv_ascii endp
;-----------------------------------------------------------------------
; This routine does a dos function by calling the old interrupt vector
;-----------------------------------------------------------------------
assume ds:nothing, es:nothing
dos_function proc
; mov cl,ah ;move our function # into cl
pushf ;These instructions simulate
;an interrupt
cli ;turn off interrupts
call CS:OLDINT21 ;Do the DOS function
sti ;enable interrupts
push cs
pop ds
push cs
pop es
ret
dos_function endp
;-----------------------------------------------------------------
;si = string di = string size es:bx = pointer to buffer to search
;ax = number of bytes in buffer to search. Zero flag set if found
;-----------------------------------------------------------------
SEARCH PROC NEAR ;si points at string
PUSH BX
PUSH DI
PUSH SI
XCHG BX,DI ;string size, ptr to data area
MOV CX,AX ;# chars in segment to search
BYTE_ADD:
LODSB ;char for first part of search
NEXT_SRCH:
REPNZ SCASB ;is first char in string in buffer
JNZ NOT_FOUND ;if not, no match
PUSH DI ;save against cmpsb
;Page 20
PUSH SI
PUSH CX
LEA CX,[BX-1] ;# chars in string - 1
JCXZ ONE_CHAR ;if one char search, we have found it
REP CMPSB ;otherwise compare rest of string
ONE_CHAR:
POP CX ;restore for next cmpsb
POP SI
POP DI
JNZ NEXT_SRCH ;if zr = 0 then string not found
NOT_FOUND:
LEA AX,[DI-1] ;ptr to last first character found
POP SI
POP DI
POP BX
RET ;that's all
SEARCH ENDP
cseg ends
end int_test
;Page 21
~
How to call DOS from within a TSR
by David O'Riva
Just a few ramblings on interactions between TSRs & DOS.
Cardinal rule: DON'T CALL DOS UNLESS YOU'RE SURE OF THE
MACHINE STATE!!!
There are a few interrupt calls and memory locations you can
play with to get this information. A list & explanation of sorts is
below. The reason you don't call DOS if you've interrupted the
machine in the middle of DOS is that:
1. The stack is unstable as far as DOS is concerned, and
you'll probably end up overwriting DOS data or going
into the weeds.
2. DOS only keeps one copy of certain crucial information
as it processes a disk-related request. i.e. BPB's,
current sectors, FAT memory images, fun stuff like that.
If you interrupt it in the middle, ask for something
different, then go back, you will probably destroy your
disk, possibly beyond recall.
3. DOS simply was not designed to be re-entrant. The first
9 or 10 function calls are cool most of the time, the
rest are strictly single-processing-stream functions.
However, there is hope. And (extra bonus) it happens to be
compatible with most true MS-DOS releases and many, many brand-name
DOSes, as well as most clones.
What you need to do, once you have determined that the user wants to
pop your program up, is set a few flags. One of them prevents your
program from being popped up AGAIN while the current DOS call is
completing, and the other tells a timer trap routine to start
looking for DOS to finish its current process (usually a matter of
split seconds). When the timer routine detects that DOS is no
longer active, it grabs control of the system and runs your TSR.
At this point, all DOS calls are as safe as they are for a
normal application.
What follows is an outline of the code necessary to activate a
TSR that uses DOS calls. Depending on the TSR, other things may
need to be done in these routines as well. Definitely make sure you
understand the interactions of the various routines before TSRing
your background disk formatter.
Okay, nitty-gritty time...
;Page 22
You need 5 main chunks of code to do this right:
a) a bit of extra initialization code
b) your TSR's main program
c) activation request server (usually a keypress
trap)
d) timer tick inDOS monitor
e) DOS busy loop monitor
And here's what they do:
a) asks DOS for the location of the inDOS flag, and stores
that away.
b) does whatever you want it to.
c) when the activation requirement is sensed (the user
pressed the hot-key, the print buffer is empty, the modem
is sending another packet, whatever) the following steps
need to be taken:
1. have we already tried to activate, and are waiting
for DOS to finish? if so, then ignore the activation
request.
2. check the inDOS flag. if we're not in DOS, then
activate as usual.
3. set a flag indicating that the TSR wants to
activate, but can't right now
4. return to DOS
d) this is linked in AFTER interrupt 08 - that is, when this
interrupt happens, call the original INT 08 handler, then
run your checking code:
1. does the TSR want to run? if not, return from the
interrupt.
2. check the inDOS flag. If it's out of DOS, then
run your code as normal
3. return from the interrupt
* * NOTE: This code has to run FAST. If it's poorly coded,
you may very well see downgraded performance of the
entire system.
e) link in to the DOS keyboard busy loop - INT 28. This
interrupt is called when DOS is waiting for a
keystroke via functions 1,3,7,8,0A, and 0C. If the TSR
takes control from this loop, then DOS functions ABOVE 0C
are safe to use. Functions 0 - 0C are NOT safe to use.
;Page 23
1. Does the TSR want to run? if not, continue down
the interrupt chain.
2. run the TSR as usual
3. continue down the interrupt chain.
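As a rough illustration of chunks (c) and (d), here is a minimal
MASM-style sketch. It is not the author's code. Every name in it
(indos_ptr, old_int08, want_popup, tsr_busy, request_popup, new_int08,
tsr_main) is invented for the example. It assumes the inDOS pointer and
the old INT 08 vector were stored during initialization, and it leaves
out the stack switching a real TSR would probably want.

```asm
cseg            segment para public 'CODE'
                assume  cs:cseg
                extrn   tsr_main:near   ;hypothetical chunk (b); expected to clear
                                        ; want_popup on entry, tsr_busy on exit

indos_ptr       dd      0               ;inDOS flag address, saved at init time
old_int08       dd      0               ;previous INT 08h vector, saved at init time
want_popup      db      0               ;1 = waiting for DOS to finish
tsr_busy        db      0               ;1 = request pending or TSR running

; (c) Called by the hot-key trap when an activation request is sensed.
request_popup   proc    near
                push    ds
                push    bx
                cmp     cs:tsr_busy,0   ;step 1: already waiting or running?
                jne     rp_done         ;  yes, ignore this request
                mov     cs:tsr_busy,1   ;lock out further requests
                lds     bx,cs:indos_ptr ;step 2: DS:BX -> inDOS flag
                cmp     byte ptr [bx],0
                jne     rp_defer        ;DOS is busy
                call    tsr_main        ;not in DOS, activate as usual
                jmp     short rp_done
rp_defer:       mov     cs:want_popup,1 ;step 3: ask the timer tick to retry
rp_done:        pop     bx              ;step 4: back to the interrupted code
                pop     ds
                ret
request_popup   endp

; (d) Timer tick monitor, installed in place of the INT 08h vector.
new_int08       proc    far
                pushf                   ;fake an INT so the old handler's IRET works
                call    dword ptr cs:old_int08 ;original handler runs first
                cmp     cs:want_popup,0 ;step 1: does the TSR want to run?
                je      t_exit
                push    ds
                push    bx
                lds     bx,cs:indos_ptr
                cmp     byte ptr [bx],0 ;step 2: has DOS finished?
                pop     bx              ;POP leaves the flags alone
                pop     ds
                jne     t_exit          ;still in DOS, try again next tick
                call    tsr_main
t_exit:         iret                    ;step 3
new_int08       endp

cseg            ends
                end
```

The common path through the tick handler (TSR not waiting) costs one
compare and one jump after the original handler returns, which is about
as cheap as this check can be made.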
NOTES: The first action your main TSR code should take is to
clear the flag that indicates the TSR is trying to run. If
this is not done, your TSR will re-enter itself at least
18.2 times per second... i.e. a MESS.
The last action your main TSR code should take before
leaving is to RESET the flags that prevent the TSR from
being activated. If you forget to do this, your TSR will
run once, then never again... I know from personal
experience that this is frustrating to a dangerous degree.
Some of this code is really complicated, so don't get
discouraged if it takes a few days of tweaking and
hair-pulling to get it right.
All numbers in this text are in hex.
The timer tick routine is really touchy, at least the way I
wrote it. Be very sure yours is reliable if you distribute
a program with this structure.
The reason that functions 0-0C are separated from the rest
of the DOS calls as far as re-entrancy is concerned is that
they use an entirely separate stack frame. I believe this
must have been done specifically for the purpose of helping
TSR writers.
Does anyone know why the hell Microsoft built these neat
functions into DOS and then refused to acknowledge their
existence?
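The first two notes above, on flag handling, could look something like
the sketch below. The flag and routine names (want_popup, tsr_busy,
tsr_main) are invented for illustration, and the registers saved are
only an example; a real TSR would save whatever it uses.

```asm
cseg            segment para public 'CODE'
                assume  cs:cseg

want_popup      db      0               ;"TSR is trying to run" flag
tsr_busy        db      0               ;"do not activate again" flag

tsr_main        proc    near
                mov     cs:want_popup,0 ;FIRST: or the timer tick re-enters us
                sti                     ;let timer and keyboard interrupts back in
                push    ax
                push    dx
                push    ds
                push    cs
                pop     ds              ;DS = our own segment
                ;       ... pop-up code and DOS calls go here ...
                pop     ds
                pop     dx
                pop     ax
                mov     cs:tsr_busy,0   ;LAST: or the TSR runs once, never again
                ret
tsr_main        endp

cseg            ends
                end
```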
INTERRUPT & FUNCTION CALLS
INT 08
Timer tick interrupt. Called 18.2 times a second on IRQ 0.
The interrupt is triggered by timer 0.
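Linking a handler in behind INT 08 is normally done with the documented
DOS get-vector and set-vector calls (INT 21h functions 35h and 25h). A
minimal sketch, assuming DS = CS as in a .COM program's setup code.
The names hook_int08, new_int08 and old_int08 are invented, and the
handler shown is only a placeholder that chains to the original.

```asm
cseg            segment para public 'CODE'
                assume  cs:cseg, ds:cseg

old_int08       dd      0               ;original INT 08h vector

new_int08       proc    far             ;placeholder handler: it only chains
                jmp     dword ptr cs:old_int08
new_int08       endp

hook_int08      proc    near            ;call once from the setup code
                push    es
                push    bx
                push    dx
                mov     ax,3508h        ;fn 35h: get vector 08h into ES:BX
                int     21h
                mov     word ptr old_int08,bx   ;save it before replacing it
                mov     word ptr old_int08+2,es
                mov     dx,offset new_int08
                mov     ax,2508h        ;fn 25h: set vector 08h from DS:DX
                int     21h
                pop     dx
                pop     bx
                pop     es
                ret
hook_int08      endp

cseg            ends
                end
```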
INT 21, FUNCTION 34
inDOS flag address request. This function returns the
address of the "inDOS flag" as a 32 bit pointer in ES:BX.
The inDOS flag is a byte that is zero when DOS is not
processing a function request, and is non-zero when DOS is
in a function.
;Page 24
NOTE: This function is officially specified as RESERVED.
Its use could change in future versions of DOS, and it can
only be guaranteed to work in straight IBM PC-DOS or MS-DOS
versions 2.0 to 3.30. Use at your own risk.
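A minimal sketch of how the pointer might be fetched once at
installation time and tested later. The names indos_ptr, save_indos_ptr
and dos_is_busy are invented for the example; only the function number
and the ES:BX return come from the description above.

```asm
cseg            segment para public 'CODE'
                assume  cs:cseg

indos_ptr       dd      0               ;far pointer to the inDOS flag

save_indos_ptr  proc    near            ;call once, before going resident
                push    es
                push    bx
                mov     ah,34h          ;reserved function: inDOS flag address
                int     21h             ;pointer comes back in ES:BX
                mov     word ptr cs:indos_ptr,bx
                mov     word ptr cs:indos_ptr+2,es
                pop     bx
                pop     es
                ret
save_indos_ptr  endp

dos_is_busy     proc    near            ;returns ZF set when DOS is idle
                push    ds
                push    bx
                lds     bx,cs:indos_ptr ;DS:BX -> inDOS flag
                cmp     byte ptr [bx],0
                pop     bx              ;POP does not disturb ZF
                pop     ds
                ret
dos_is_busy     endp

cseg            ends
                end
```

Function 34 is itself a DOS call, so it belongs in the setup code and
not inside an interrupt handler; the handlers only read the saved
pointer.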
INT 28
DOS keyboard busy loop. This interrupt is called when DOS is
waiting for a keystroke in the console input functions.
When this interrupt is issued, it is safe to use any DOS
call ABOVE 0C. Calls to DOS functions 0 - 0C will trash the
stack and do nasty things.
NOTE: This function is officially RESERVED. See the note
for function 34 above.
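A sketch of an INT 28 hook along the lines of chunk (e). The names
old_int28, want_popup, new_int28 and tsr_main are invented. The
routine tsr_main is assumed to preserve every register and, per the
description above, to stay away from DOS functions 0 - 0C.

```asm
cseg            segment para public 'CODE'
                assume  cs:cseg
                extrn   tsr_main:near   ;hypothetical: the TSR's main routine

old_int28       dd      0               ;previous INT 28h vector, saved at init time
want_popup      db      0               ;set by the activation request code

new_int28       proc    far
                cmp     cs:want_popup,0 ;step 1: does the TSR want to run?
                je      i28_chain       ;  no, just pass the interrupt along
                call    tsr_main        ;step 2: run the TSR
i28_chain:      jmp     dword ptr cs:old_int28 ;step 3: continue down the chain
new_int28       endp

cseg            ends
                end
```

Jumping to the old vector (instead of calling it and then doing an
IRET) lets the previous handler return straight to DOS.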
AUTHOR'S NOTE:
First, the references I listed are really great. They've
helped me out a lot over the past few years. Second, if your
hard disk gets munched by your TSR, read the disclaimer.
! ! ! ! ! ! C A V E A T P R O G R A M M E R ! ! ! ! ! !
& Disclaimer
The techniques described in here are, for the most part,
UNDOCUMENTED by Microsoft or IBM. This means that you CAN NOT BE
SURE that they will work on all IBM clones, and could even cause
crashes on some! The timer tick interrupt provides some essential
system services, and messing with it incautiously can wreak havoc.
The program outlines presented here are what worked for me on
my system, and what should work on about 90% of the clones out
there. However, I still suggest that you find a reference for all
of the interrupts and functions described here. This file is meant
to be a guideline and aid only.
REFERENCES:
DOS Programmer's Reference, by Terry R. Dettmann.
$22.95, QUE Corporation
IBM DOS Technical Reference, version 3.30
$(?), International Business Machines Corp.
I can't remember how much it cost...
;Page 25
Environment Variable Processor
by David O'Riva
~
PAGE 60,132
TITLE Q43.ASM - editor prelude & display manager
;
;
COMMENT~***********************************************************************
* ---===> All code in this file is copyright 1989 by David O'Riva <===--- *
*******************************************************************************
* *
* The above line is there only to prevent people (or COMPANIES) from *
* claiming original authorship of this code and suing me for using it. *
* You're welcome to use it anyhow you care to. *
* *
*
* Environment Variable Finder & Processor - *
* *
* The "get_environment_variable" routine is complete in itself, and can *
* be extracted and used in anything else that needs one. Just copy the entire
* routine, from the header to the endp (don't forget the RADIX and DW).
* The routine currently uses 315 (decimal) bytes.
*
*
* This program's purpose is to invoke an editor (or any program, really)
* with a specific machine state depending on environment variables. (Yeah!!!)
* Currently it is set up to change my screen to one of various modes, with
* the variable ED_SCRMODE being set to:
* 100/75 = 100 columns by 75 lines
* 132/44 = 132 by 44
* 80/44 = 80 by 44
* ...and then to EXEC my editor (qedit) with that mode set. You could
* set the screen back to the standard 80x25 after the EXEC returns.
*
* Note: The 80/44 set code should work on most (or all?) EGAs. The
* other two high-res text modes use built-in extended BIOS modes in my
* Everex EV-657 EGA card (the 800x600 version) w/multisync monitor. If you've
* got one of those, you're in luck - no mods needed. It will also work on the
* EV-673 EVGA card w/appropriate monitor.
*
*
* Note to BEGINNERS: This is not an example of "good" asm code. This
* file is an example of what happens when you're up at 1:00am with too much
* coffee and a utility that needs to be fixed.
*
*
*
* This is a COM program, not an EXE. Remember to use EXE2BIN.
*
*
*
******************************************************************************~
;
;
TRUE EQU 0FFH
FALSE EQU 0
;
;******************************************************************************
;
CODE SEGMENT PARA PUBLIC 'CODE'
ASSUME CS:CODE,DS:CODE,ES:CODE,SS:CODE
;
MAIN PROC NEAR
;Page 26
ORG 100H
entry:
;------------------------------------------------------------------------------
; set the screen to the correct mode
;------------------------------------------------------------------------------
call set_screen_mode
;------------------------------------------------------------------------------
; check for pathname change in environment
;------------------------------------------------------------------------------
call set_exec_name
;------------------------------------------------------------------------------
; setup memory and run the program
;------------------------------------------------------------------------------
MOV BX,OFFSET ENDRESIDENT ;deallocate unnecessary memory
MOV CL,4
SHR BX,CL
INC BX
MOV AH,04AH
INT 021H
MOV AX,CS ;exec the program
MOV INSERT_CS1,AX
MOV INSERT_CS2,AX
MOV INSERT_CS3,AX
MOV AX,04B00H
MOV BX,OFFSET EXECPARMS
MOV DX,OFFSET PROGNAME
INT 021H
;------------------------------------------------------------------------------
; clean up and leave
;------------------------------------------------------------------------------
MOV AH,04DH ;get return code from program
INT 021H
MOV AH,04CH ;leave
INT 021H
;
;******************************************************************************
;
; data
;
PROGNAME DB 'F:\UTILITY\MISC\Q.EXE',0
db 100 dup(' ')
EXECPARMS DW 0 ;use current environment
DW 080H ;use current command tail
INSERT_CS1 DW ?
DW 05CH ;use current FCB's
INSERT_CS2 DW ?
DW 06CH
INSERT_CS3 DW ?
;Page 27
ENDRESIDENT:
;******************************************************************************
; more data - used only for setup & checks
;
valid_modes db '80/44 '
db '132/44'
db '100/75'
screen_mode db 6 dup(' ') ;6-byte buffer for the fixed-length mode string
mode_jump dw goto_43
dw goto_132
dw goto_100
ev_mode db 'ED_SCRMODE',0
ev_pathname db 'ED_PATH',0
PAGE
;******************************************************************************
; set_screen_mode -
;
;
; ENTRY:
;
; EXIT:
;
; DESTROYED:
;
;------------------------------------------------------------------------------
set_screen_mode:
MOV AH,012H ;check for presence of EGA/VGA
MOV BL,010H
INT 010H
CMP BL,010H ;BL changed? (should have # of
; bytes of EGA memory)
JE ssm_no_ega ;This is no EGA!
;------------------------------------------------------------------------------
; check environment for correct mode set -
; don't set mode if none specified
;------------------------------------------------------------------------------
mov si,offset ev_mode
mov di,offset screen_mode
mov cx,6 ;accept 6 chars
mov ax,4 ;get fixed-length string
call get_environment_variable
and ax,0feh
jne ssm_no_env_mode
;------------------------------------------------------------------------------
; look up the variable's value in my mode table
;------------------------------------------------------------------------------
mov bx,0
mov di,offset valid_modes
;Page 28
ssm_check_mode: mov dx,di
mov si,offset screen_mode
mov cx,6
repe cmpsb
je ssm_found_mode
mov di,dx
add di,6
inc bx
cmp bx,3
jne ssm_check_mode
jmp ssm_bad_mode
;------------------------------------------------------------------------------
; set the correct screen mode
;------------------------------------------------------------------------------
ssm_found_mode: shl bx,1
jmp mode_jump[bx]
goto_100: mov ax,0070h
mov bx,8
int 010h
jmp ssm_leave
goto_132: mov ax,0070h
mov bx,0bh
int 010h
jmp ssm_leave
goto_43: MOV AX,3
INT 010H
MOV AX,01112H ;set to 8x8 chars (43/50 lines)
MOV BL,0
INT 010H
ssm_no_env_mode:
ssm_bad_mode:
ssm_no_ega:
ssm_leave:
ret
PAGE
;******************************************************************************
; set_exec_name -
;
;
; ENTRY:
;
; EXIT:
;
; DESTROYED:
;
;------------------------------------------------------------------------------
set_exec_name:
;
; If you want, write a chunk here that will read an alternate pathname
; for the editor to be executed from a different variable (like ED_PATH)
;Page 29
; I was going to do it, but ran out of time and need. (My editor never wanders
; around!)
;
ret
PAGE
;******************************************************************************
; Get_environment_variable -
;
;
;
;
; ENTRY: ds:[si] -> ASCIIZ environment variable name
; ds:[di] -> (up to) 129 byte buffer for string
; es = segment of program's PSP
; cx = maximum # of characters to accept
; ax = variable return format (the whole word is tested, so ah must be 0)
; 0 - return string in ASCIIZ format
; xxxxx 0 ........
;
; 1 - return string in DOS string ('$' terminated) format
; xxxxxxxx $ ........
;
; 2 - return string in DOS input buffer format
; maxchrs,numchrs,xxxxxxxx CR ............
;
; 3 - return string in command tail format
; numchrs,xxxxxxxxxxx CR ..........
;
; 4 - return string in fixed-length (CX chars) format
; xxxxxx
;
; EXIT: al = return codes:
; bit 0 - if set, string was longer than max, truncated
; 1 - if set, string did not exist
; 2 - if set, invalid return format requested
;
; DESTROYED: ah is undefined
;
;------------------------------------------------------------------------------
.RADIX 010h
gev_flags dw ?
Get_environment_variable:
push bx
push cx
push dx
push si
;Page 30
push di
push es
mov cs:gev_flags,ax
mov es,es:[02c] ;es -> program's environment
;------------------------------------------------------------------------------
; make sure the environment has at least one variable in it
;------------------------------------------------------------------------------
mov ax,es:[0]
cmp ax,0
jne gev_exists
mov ax,2
jmp gev_leave
;------------------------------------------------------------------------------
; find length of search string
;------------------------------------------------------------------------------
gev_exists: push cx
push di
mov di,si
mov cx,0ffff
gev_sourcelen:
inc cx
mov al,[di]
inc di
cmp al,0
jne gev_sourcelen
cmp cx,0
jne gev_startfind
pop di
pop cx
mov ax,2
jmp gev_leave
;------------------------------------------------------------------------------
; find string
;------------------------------------------------------------------------------
gev_startfind: mov bx,cx
mov dx,si
mov di,0
gev_checknext:
mov cx,bx
mov si,dx
repe cmpsb
je gev_found?
gev_tonextvar: mov cx,0ffff
mov al,0
repne scasb
cmp es:[di],al
jne gev_checknext
mov ax,2
pop di
pop cx
jmp gev_leave
;Page 31
gev_found?: cmp byte ptr es:[di],'='
jne gev_tonextvar
;------------------------------------------------------------------------------
; found the string in the environment
;------------------------------------------------------------------------------
gev_found: inc di
mov si,di
pop di
pop cx
cmp cs:gev_flags,1
ja gev_ibufform
;------------------------------------------------------------------------------
; move normal string with 0 or $ terminator
;------------------------------------------------------------------------------
gev_nextchar0: mov al,es:[si]
cmp al,0
je gev_setterm0
mov ds:[di],al
inc si
inc di
dec cx
jne gev_nextchar0
mov al,es:[si]
cmp al,0
je gev_setterm0
mov al,1
gev_setterm0: cmp cs:gev_flags,0
jne gev_setterm1
mov byte ptr ds:[di],0 ;ASCIIZ string
jmp gev_leave
gev_setterm1: mov byte ptr ds:[di],'$' ;DOS string
jmp gev_leave
;------------------------------------------------------------------------------
; move string into DOS input buffer format (int 21 function 0A)
;------------------------------------------------------------------------------
gev_ibufform: cmp cs:gev_flags,2
jne gev_ctailform
mov ds:[di],cl ;set max length
inc di
mov bx,di
inc di
mov dx,0
gev_nextchar2: mov al,es:[si]
cmp al,0
je gev_setterm2
mov ds:[di],al
inc si
inc di
inc dx
dec cx
jne gev_nextchar2
mov al,es:[si]
cmp al,0
;Page 32
je gev_setterm2
mov al,1
gev_setterm2: mov byte ptr ds:[di],0d ;add carriage return
mov ds:[bx],dl ;set actual # of chars
jmp gev_leave
;------------------------------------------------------------------------------
; move string into command tail format
;------------------------------------------------------------------------------
gev_ctailform: cmp cs:gev_flags,3
jne gev_fixedform
mov bx,di
inc di
mov dx,0
gev_nextchar3: mov al,es:[si]
cmp al,0
je gev_setterm3
mov ds:[di],al
inc si
inc di
inc dx
dec cx
jne gev_nextchar3
mov al,es:[si]
cmp al,0
je gev_setterm3
mov al,1
gev_setterm3: mov byte ptr ds:[di],0d ;set carriage return
mov ds:[bx],dl ;set # of bytes
jmp gev_leave
;------------------------------------------------------------------------------
; move string into fixed-length area (pad it out with spaces)
;------------------------------------------------------------------------------
gev_fixedform: cmp cs:gev_flags,4
jne gev_badform
gev_nextchar4: mov al,es:[si]
cmp al,0
je gev_padout4
mov ds:[di],al
inc si
inc di
dec cx
jne gev_nextchar4
mov al,es:[si]
cmp al,0
je gev_setterm4
mov al,1
jmp gev_setterm4
gev_padout4: mov byte ptr ds:[di],' '
inc di
dec cx
jne gev_padout4
;Page 33
mov al,0
gev_setterm4: jmp gev_leave
gev_badform: mov ax,4
gev_leave: pop es
pop di
pop si
pop dx
pop cx
pop bx
ret
.RADIX 00ah
MAIN ENDP
;
;******************************************************************************
;
CODE ENDS
;
;******************************************************************************
;
END ENTRY
~
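The header of get_environment_variable lists five return formats. As a
hedged illustration of format 0 (ASCIIZ), here is one way the empty
set_exec_name stub might be filled in so that ED_PATH overrides the
built-in pathname. This is a sketch, not the author's code. It assumes
it replaces the stub inside Q43.ASM (so PROGNAME, ev_pathname and
get_environment_variable are the listing's own symbols), that ES still
holds the PSP segment, and that the radix is still decimal at that
point in the file.

```asm
set_exec_name:
                mov     si,offset ev_pathname ;ds:si -> 'ED_PATH',0
                mov     di,offset PROGNAME  ;ds:di -> buffer to overwrite
                mov     cx,120              ;PROGNAME plus its padding is 122 bytes
                mov     ax,0                ;format 0 = ASCIIZ (whole AX is tested)
                call    get_environment_variable
                ret                         ;AL bit 1 set: no ED_PATH found
```

If ED_PATH is not set, the routine returns before touching the buffer,
so the default name survives. An empty or truncated value is not
handled here; a cautious caller would test AL before going on to the
EXEC.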
;Page 34
Program Reviews
Multi-Edit ver 4.00a (demo version): Reviewed by Patrick O'Riva.
Multi-Edit is a high feature text editor with many word
processor features. The demo version is completely functional though
some of the reference material is not supplied and there are
advertising screens. I consider this fully acceptable as shareware.
The complete version with the macro reference library is available
for $79.95 and an expanded version with a spelling checker,
integrated Communication terminal and phone book is $179.95.
I couldn't list all of its features here, but in addition to
everything you have come to expect in a quality programming editor
(multi meg files, programmable keyboard etc.) there are a number of
powerful additions you might not expect. The word processor
functions rival most of the specialty ones that I've tried. It
won't compete with the major names for those of you who are addicted
to them, but it does offer full printer support, preview file, table
of contents generation, and extension keyed formatting. It will
right or left justify, and supports headers and footers, and auto
pagination.
It contains a calculator and an ASCII table.
Saving the best for last: The language support is very strong.
It has built in templates for many common constructs, and the
assembler/compiler is invoked from within the editor with a single
key. It will read the error table generated by a variety of software
and with successive key presses move you to each line where an error
was found.
Something which I found unique is Multi-Edit's help system. It
is a hypertext system, and is wonderfully context sensitive most
everywhere in the system. From the Help menu it has a complete table
of contents and index. It is also fully user extendible. I have
integrated a database I have documenting the full set of interrupts
that totals about 400k and the documentation on my spelling checker
as well (which integrated into Multi-Edit almost seamlessly).
In many ways this is the best editor I've ever used, but it
does have a few faults, some of which are very subtle and may not
even be problems to most users. It is a 'tad' slower than what I'm
used to with Qedit. This is seldom noticed except in the execution
of complex macros. It is quite slow in paging through long files.
There are some true bugs in this version such as a crash of the
program (but not the data or the system) when large deletes from
large files are made. Multi-edit's treatment of file windows while
very versatile is slightly different and may take some time to get
used to.
For all of its advantages, until putting this Magazine
together, I still found myself reverting to Qedit for the speed and
ease of use. Multi-Edit is the first software that has made assembling
the Magazine anything other than an exercise in frustration.
;Page 35
SHEZ
Just a quick mention because it isn't programming related.
Shez is a compression shell along the lines of ArcMaster and Qfiler.
It is a fine and versatile piece of programming, supporting all
common compression types. The more recent versions have virus
detection when used with the SCANV programs by John McAfee.
4DOS
This is a program that is an absolute joy to use. It is a
complete and virtually 100% compatible replacement for Command.com.
The code size is just slightly larger than MSDOS 3.3 command.com but
the added and enhanced functions save many times that amount in
TSR's you no longer need to install. Just to mention a few features:
An alias command whereby you can assign whatever mnemonic you
wish to a command or string of commands. Select is a screen
interface that allows you to mark files for use with a command.
Except will execute a command for a set of files excluding one or
more. There is an environment editor, built in Help, command
and filename completion, Global that will execute through the
directory tree, A Timer to keep track of elapsed time, as well as
many enhanced batch file commands. Additional features are too
numerous to mention. The current version is 4.23 and is available as
Shareware, but you should register after your first 10 minutes of
use. You will be hooked forever.
The above 3 programs should all be available on your local BBS's.
Please be sure and register programs you use.
;Page 36
Book Reviews
Assembly Language Quick Reference
by Allen L. Wyatt, Sr.
Reviewed by George A. Stanislav
This 1989 book published by QUE is a nice and handy reference for
assembly language programmers.
Instruction sets for six microprocessors and numeric coprocessors
are listed:
8086/8088 8087
80286 80287
80386 80387
I could find no reference to the 80186 microprocessor, not even a
suggestion that it uses the 80286 instruction set but does not
multitask. Because the 80186 was the brain of Tandy 2000, quite a
popular computer in its own time, its omission from the book is
surprising.
There is no division into chapters. This makes it somewhat hard to
figure out where the instruction sets of individual processors
start. Each higher processor set contains only the list of
instructions that are new for the processor or that changed
somewhat.
After a brief introduction, the book starts by listing,
alphabetically, all 8086/8088 instructions. The listing itself is
very well done. Each instruction stands out graphically from the
rest of the text. For every code there is some classification, e.g.
arithmetic, bit manipulation, data-transfer.
This is followed by a very brief description ended with a colon.
Next, a more detailed explanation tells any assembly language
programmer what the instruction does.
If applicable, the book lists flags affected by the instruction.
Most instructions also contain some coding examples.
The 8086/8088 instruction set is followed by the 80286 set, or
rather subset as it only contains the instructions new to this
microprocessor. Similarly, the 80386 section contains only those
instructions not found in the 8086/8088 and 80286 sections as well
as those that changed somewhat.
I find it puzzling that among those instructions considered changed
in the 80386 microprocessor we can find AND, NEG, POP - because they
can be used as 32-bit instructions in addition to their original
usage - but cannot find JE, JNE, and all other conditional jumps.
These did indeed change in the 80386 processor inasmuch as they can be
used either as SHORT or as NEAR while on the older microprocessors
they could only jump within the SHORT range.
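To make the difference concrete, a small sketch follows. It is an
illustration only, not taken from the book. The labels are arbitrary,
and NEAR PTR is the MASM spelling for requesting the longer form;
other assemblers may handle this differently.

```asm
cseg            segment para public 'CODE'  ;opened before .386, so still 16-bit
                assume  cs:cseg

; 8086/8088/80286: conditional jumps are SHORT only (-128 to +127 bytes),
; so a distant target needs the test inverted around an unconditional JMP.
old_way:        cmp     ax,bx
                jne     old_skip        ;short hop over the JMP
                jmp     distant         ;near JMP reaches the whole segment
old_skip:       nop

; 80386: the same test can branch directly using the NEAR form.
                .386
new_way:        cmp     ax,bx
                je      near ptr distant ;encoded as 0F 84 plus a 16-bit offset

                db      200 dup (90h)   ;200 NOPs put the target out of SHORT range
distant:        ret

cseg            ends
                end
```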
;Page 37
The rest of the book contains instructions for the math
coprocessors, the 8087, 80287 and 80387. This section is divided in
the same way as the microprocessor part, i.e. describing first the
8087 set, then the one new instruction for the 80287, followed by
the new 80387 instructions.
There are several possibilities of improvement QUE might consider
for future editions of this book:
o Make it easier to find the start of each section by color
coding the side of the paper;
o Include references to the instructions of the older
processors within the listing for the new processors.
Small print of the instruction with the page number where
a more detailed description can be found would be a nice
enhancement;
o At least a brief mention of the 80186 microprocessor and
perhaps the V-20 and V-30 would be useful.
Despite the possibility of improvement, this is an excellent
reference for any assembly language programmer. Its small size makes
it very handy to keep it next to the computer as well as to take it
along when travelling.
The book costs $6.95 in USA and $8.95 in Canada.
;Page 38
GPFILT.ASM
~
page ,132
TITLE GPFILT
subttl General Purpose Filter Template
;
; GPFILT.ASM
; This file contains a template for a general-purpose assembly language
; filter program.
;
; Fill in the blanks for what you wish to do. The program is set up to
; accept a command line in the form:
; COMMAND [{-|/}options] [infile [outfile]]
;
; If infile is not specified, stdin is used.
; If outfile is not specified, stdout is used.
;
; To compile and link:
; MASM GPFILT ;
; LINK GPFILT ;
; EXE2BIN GPFILT GPFILT.COM
;
; Standard routines supplied in the general shell are:
;
; get_arg - returns the address of the next command line argument in
; DX. Since this is a .COM file, the routine assumes DS will
; be the same as the command line segment.
; The routine will return with Carry set when it reaches the end
; of the command line.
;
; err_msg - displays an ASCIIZ string on the STDERR device. Call with the
; address of the string in ES:DX.
;
; do_usage- displays the usage message on the STDERR device and exits
; with an error condition (errorlevel 1). This routine will
; never return.
;
; getch - returns the next character from the input stream in AL.
; It will return with carry set if an error occurs during read.
; It will return with the ZF set at end of file.
;
; putch - writes a character from AL to the output stream. Returns with
; carry set if a write error occurs.
;
cseg segment
assume cs:cseg, ds:cseg, es:cseg, ss:cseg
org 0100h ;for .COM files
start: jmp main ;jump around data area
;
; Equates and global data area.
;
; The following equates and data areas are required by the general filter
; routines. User data area follows.
;
;Page 39
STDIN equ 0
STDOUT equ 1
STDERR equ 2
STDPRN equ 3
cr equ 0dh
lf equ 0ah
space equ 32
tab equ 9
infile dw STDIN ;default input file is stdin
outfile dw STDOUT ;default output file is stdout
errfile dw STDERR ;default error file is stderr
prnfile dw STDPRN ;default print file is stdprn
cmd_ptr dw 0081h ;address of first byte of command tail
PSP_ENV equ 002ch ;The segment address of the environment
;block is stored here.
infile_err db cr, lf, 'Error opening input file', 0
outfile_err db cr, lf, 'Error opening output file', 0
aborted db 07, cr, lf, 'Program aborted', 0
usage db cr, lf, 'Usage: ', 0
crlf db cr, lf, 0
;************************************************************************
;* *
;* Buffer sizes for input and output files. The buffers need not be *
;* the same size. For example, a program that expands tabs in a text *
;* file will output more characters than it reads. Therefore, the *
;* output buffer should be slightly larger than the input buffer. In *
;* general, the larger the buffer, the faster the program will run. *
;* *
;* The only restriction here is that the combined size of the buffers *
;* plus the program code and data size cannot exceed 64K. *
;* *
;* The easiest way to determine maximum available buffer memory is to *
;* assemble the program with minimum buffer sizes and examine the value *
;* of the endcode variable at the end of the program. Subtracting this *
;* value from 65,536 will give you the total buffer memory available. *
;* *
;************************************************************************
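;
; As a worked example (plain arithmetic on the default sizes set below):
; two 31K buffers occupy 2 * 31 * 1024 = 63,488 bytes, so
; 65,536 - 63,488 = 2,048 bytes remain for the PSP (256 bytes), the
; program code and data, and the stack. If your own additions push the
; code and data past that, reduce one of the buffer sizes.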
;
INNBUF_SIZE equ 31 ;size of input buffer (in K)
OUTBUF_SIZE equ 31 ;size of output buffer (in K)
;
;************************************************************************
;* *
;* Data definitions for input and output buffers. DO NOT modify these *
;* definitions unless you know exactly what it is you're doing! *
;* *
;************************************************************************
;
; Input buffer
ibfsz equ 1024*INNBUF_SIZE ;input buffer size in bytes
inbuf equ endcode ;input buffer
ibfend equ inbuf + ibfsz ;end of input buffer
;
;Page 40
; ibfptr is initialized to point past end of input buffer so that the first
; call to getch will result in a read from the file.
;
ibfptr dw inbuf+ibfsz
; output buffer
obfsz equ 1024*OUTBUF_SIZE ;output buffer size in bytes
outbuf equ ibfend ;output buffer
obfend equ outbuf + obfsz ;end of output buffer
obfptr dw outbuf ;start at beginning of buffer
;************************************************************************
;* *
;* USER DATA AREA *
;* *
;* Insert any data declarations specific to your program here. *
;* *
;* NOTE: The prog_name, use_msg, and use_msg1 variables MUST be *
;* defined. *
;* *
;************************************************************************
;
; This is the program name. Under DOS 3.0 and later it is not used, because
; we can get the program name from the environment. Prior to 3.0, this
; information is not supplied by the OS.
;
prog_name db 'GPFILT', 0
;
; This is the usage message. The first two lines are required.
; The first line is the program's title line.
; Make sure to include the 0 at the end of the first line!!
; The second line shows the syntax of the program.
; Any following lines are optional and can describe options, features,
; etc.
; The message MUST be terminated by a 0.
;
use_msg db ' - General Purpose FILTer program.', cr, lf, 0
use_msg1 label byte
db '[{-|/}options] [infile [outfile]]', cr, lf
db cr, lf
db 'If infile is not specified, STDIN is used', cr, lf
db 'If outfile is not specified, STDOUT is used', cr, lf
db 0
;
;************************************************************************
;* *
;* The main routine parses the command line arguments, opens files, and *
;* does other initialization tasks before calling the filter procedure *
;* to do the actual work. *
;* For a large number of filter programs, this routine will not need to *
;* be modified. Options are parsed in the get_options proc., and the *
;* filter proc. does all of the 'filter' work. *
;* *
;************************************************************************
;
main: cld
call get_options ;process options
;Page 41
jc gofilter ;carry indicates end of arg list
mov ah,3dh ;open file
mov al,0 ;read access
int 21h ;open the file
mov word ptr ds:[infile], ax ;save file handle
jnc main1 ;carry clear indicates success
mov dx,offset infile_err
jmp short err_exit
main1: call get_arg ;get cmd line arg in DX
jc gofilter ;carry indicates end of arg list
mov ah,3ch ;create file
mov cx,0 ;normal file
int 21h ;create the file
mov word ptr ds:[outfile],ax ;save file handle
jnc gofilter ;carry clear indicates success
mov dx,offset outfile_err
jmp short err_exit
gofilter:
call filter ;do the work
jc err_exit ;exit immediately on error
mov ah,3eh
mov bx,word ptr [infile]
int 21h ;close input file
mov ah,3eh
mov bx,word ptr [outfile]
int 21h ;close output file
mov ax,4c00h
int 21h ;exit with no error
err_exit:
call err_msg ;output error message
mov dx,offset aborted
call err_msg
mov ax,4c01h
int 21h ;and exit with error
;
;************************************************************************
;* *
;* get_options processes any command line options. Options are *
;* preceded by either - or /. There is a lot of flexibility here. *
;* Options can be specified separately, or as a group. For example, *
;* the command "GPFILT -x -y -z" is equivalent to "GPFILT -xyz". *
;* *
;* This routine MUST return the address of the next argument in DX, or *
;* the carry flag set if there are no more arguments. In other words, *
;* return what was returned by the last call to get_arg. *
;* *
;************************************************************************
;
get_options proc
call get_arg ;get command line arg
jnc opt1
; If at least one argument is required, use this line
; call do_usage ;displays usage msg and exits
; If there are no required args, use this line
ret ;if no args, just return
opt1: mov di, dx
mov al,byte ptr ds:[di]
;Page 42
cmp al,'-' ;if first character of arg is '-'
jz opt_parse
cmp al,'/' ;or '/', then get options
jz opt_parse
clc ;the compares above set carry if AL < '/'
ret ;otherwise exit with DX -> argument
opt_parse:
inc di
mov al,byte ptr ds:[di]
or al,al ;if end of options string
jz nxt_opt ;get cmd. line arg
cmp al,'?' ;question means show usage info
jz do_usage
;
;************************************************************************
;* *
;* Code for processing other options goes here. The current option *
;* character is in AL, and the remainder of the option string is pointed*
;* to by DS:DI. *
;* *
;************************************************************************
;
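; A minimal sketch of what could go here, assuming a hypothetical -s
; option and a hypothetical flag byte s_flag that you would declare in
; the user data area (neither is part of the original shell). It is left
; commented out so the shell assembles unchanged:
;
;        cmp     al,'s'                  ;hypothetical option letter
;        jnz     bad_opt
;        mov     byte ptr ds:[s_flag],1  ;remember that -s was given
;        jmp     short opt_parse         ;look at next option character
; bad_opt: jmp   do_usage                ;unknown option: show usage, exit
;
; As written, the sketch matches only a lower-case 's' and rejects any
; option it does not recognize, whereas the shell by itself silently
; skips unknown option characters.
;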
jmp short opt_parse
nxt_opt:
call get_arg ;get next command line arg
jnc opt1 ;if carry
vld_args: ;then validate arguments
;
;************************************************************************
;* *
;* Validate arguments. If some options are mutually exclusive/dependent*
;* use this area to validate them. Whatever the case, if you must *
;* abort the program, call the do_usage procedure to display the usage *
;* message and exit the program. *
;* *
;************************************************************************
;
ret ; no more options
;
;************************************************************************
;* *
;* Filter does all the work. Modify this routine to do what it is you *
;* need done. *
;* *
;************************************************************************
;
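; A sketch of one possible modification (an illustration only, not part
; of the original shell): to upper-case ASCII letters instead of
; stripping the high bit, the "and al, 7fh" line in the loop below could
; be replaced by
;
;        cmp     al,'a'
;        jb      filt_put        ;below 'a': pass through unchanged
;        cmp     al,'z'
;        ja      filt_put        ;above 'z': pass through unchanged
;        sub     al,20h          ;'a'..'z' becomes 'A'..'Z'
; filt_put:                       ;hypothetical label, to be placed on
;                                 ; the "call putch" line
;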
filter proc
call getch ;get a character from input into AL
jbe filt_done ;exit on error or EOF
and al, 7fh ;strip the high bit
call putch ;and output it
jc filt_ret ;exit on error
jmp short filter
filt_done:
jc filt_ret ;carry set is error
call write_buffer ;output what remains of the buffer
filt_ret:
;Page 43
ret
filter endp
;
;************************************************************************
;* *
;* Put any program-specific routines here *
;* *
;************************************************************************
;
;************************************************************************
;* *
;* For most programs, nothing beyond here should require modification. *
;* The routines that follow are standard routines used by almost every *
;* filter program. *
;* *
;************************************************************************
;
;************************************************************************
;* *
;* This routine outputs the usage message to the STDERR device and *
;* aborts the program with an error code. A little processing is done *
;* here to get the program name and format the output. *
;* *
;************************************************************************
;
do_usage:
mov dx, offset crlf
call err_msg ;output newline
mov ah,30h ;get DOS version number
int 21h
sub al,3 ;check for version 3.0 or later
jc lt3 ;if carry, earlier than 3.0
;
; For DOS 3.0 and later the full pathname of the file used to load this
; program is stored at the end of the environment block. We first scan
; all of the environment strings in order to find the end of the env, then
; scan the load pathname looking for the file name.
;
push es
mov ax, word ptr ds:[PSP_ENV]
mov es, ax ;ES is environment segment address
mov di, 0
mov cx, 0ffffh ;this ought to be enuf
xor ax, ax
getvar: scasb ;get char
jz end_env ;end of environment
gv1: repnz scasb ;look for end of variable
jmp short getvar ;and loop 'till end of environment
end_env:
inc di
inc di ;bump past word count
;
; ES:DI is now pointing to the beginning of the pathname used to load the
; program. We will now scan the filename looking for the last path specifier
; and use THAT address to output the program name. The program name is
; output WITHOUT the extension.
;Page 44
;
mov dx, di
fnloop: mov al, byte ptr es:[di]
or al, al ;if end of name
jz do30 ;then output it
inc di
cmp al, '\' ;if path specifier
jz updp ;then update path pointer
cmp al, '.' ;if '.'
jnz fnloop
mov byte ptr es:[di-1], 0 ;then place a 0 so we don't get ext
jmp short fnloop ; when outputting prog name
updp: mov dx, di ;store
jmp short fnloop
;
; ES:DX now points to the filename of the program loaded (without extension).
; Output the program name and then go on with rest of usage message.
;
do30: call err_msg ;output program name
pop es ;restore
jmp short gopt3
;
; We arrive here if the current DOS version is earlier than 3.0. Since the
; loaded program name is not available from the OS, we'll output the name
; entered in the 'prog_name' field above.
;
lt3: mov dx, offset prog_name
call err_msg ;output the program name
;
; After outputting program name, we arrive here to output the rest of the
; usage message. This code assumes that the usage message has been
; written as specified in the data area.
;
gopt3: mov dx, offset use_msg
call err_msg ;output the message
mov dx, offset usage
call err_msg
mov dx, offset use_msg1
call err_msg
mov ax,4c01h
int 21h ;and exit with error
get_options endp
;
;************************************************************************
;* *
;* Output a message (ASCIIZ string) to the standard error device. *
;* Call with address of error message in ES:DX. *
;* *
;************************************************************************
;
err_msg proc
cld
mov di,dx ;string address in di
mov cx,0ffffh
xor ax,ax
repnz scasb ;find end of string
;Page 45
xor cx,0ffffh
dec cx ;CX is string length
push ds
mov ax,es
mov ds,ax ;DS is segment address
mov ah,40h
mov bx,word ptr cs:[errfile]
int 21h ;output message
pop ds
ret
err_msg endp
;
;************************************************************************
;* *
;* getch returns the next character from the file in AL. *
;* Returns carry = 1 on error *
;* ZF = 1 on EOF *
;* Upon exit, if either Carry or ZF is set, the contents of AL is *
;* undefined. *
;* *
;************************************************************************
;
; Local variables used by the getch proc.
eof db 0 ;set to 1 when EOF reached in read
last_ch dw ibfend ;pointer just past last char in buffer
getch proc
mov si,word ptr ds:[ibfptr] ;get input buffer pointer
cmp si,word ptr ds:[last_ch];at end of buffered data?
jz getch_eob ;yes, go refill the buffer
getch1: lodsb ;character in AL
mov word ptr ds:[ibfptr],si ;save buffer pointer
or ah,1 ;clears both Z flag and carry
ret ;and done
getch_eob: ;end of buffer processing
cmp byte ptr ds:[eof], 1 ;end of file?
jnz getch_read ;nope, read file into buffer
getch_eof:
mov byte ptr ds:[eof], 1 ;remember EOF so later calls don't re-read
xor ax, ax ;set Z (and clear carry) to indicate EOF
ret ;and return
getch_read: ; Read the next buffer full from the file.
mov ah,3fh ;read file function
mov bx,word ptr ds:[infile] ;input file handle
mov cx,ibfsz ;#characters to read
mov dx,offset inbuf ;read into here
int 21h ;DOS'll do it for us
jc read_err ;Carry means error
or ax,ax ;If AX is zero,
jz getch_eof ;we've reached end-of-file
add ax,offset inbuf
mov word ptr ds:[last_ch],ax;and save it
mov si,offset inbuf
jmp short getch1 ;and finish processing character
;Page 46
read_err: ;return with error and...
mov dx,offset read_err_msg ; DX pointing to error message string
ret
read_err_msg db 'Read error', cr, lf, 0
getch endp
;
;************************************************************************
;* *
;* putch writes the character passed in AL to the output file. *
;* Returns carry set on error. The character in AL is retained. *
;* *
;************************************************************************
;
putch proc
mov di,word ptr ds:[obfptr] ;get output buffer pointer
stosb ;save the character
mov word ptr ds:[obfptr],di ;and update buffer pointer
cmp di,offset obfend ;if buffer pointer == buff end
clc
jnz putch_ret
push ax
call write_buffer ;then we've got to write the buffer
pop ax
putch_ret:
ret
putch endp
;
;************************************************************************
;* *
;* write_buffer writes the output buffer to the output file. *
;* This routine should not be called except by the putch proc. and at *
;* the end of all processing (as demonstrated in the filter proc). *
;* *
;************************************************************************
;
write_buffer proc ;write buffer to output file
mov cx, word ptr ds:[obfptr]
sub cx, offset outbuf ;compute #bytes to write
jz wb_ok ;buffer empty: nothing to write (a zero-length
 ; write returns AX = 0, which is not disk full)
mov ah, 40h ;write to file function
mov bx, word ptr ds:[outfile];output file handle
mov dx, offset outbuf ;from this buffer
int 21h ;DOS'll do it
jc write_err ;carry is error
cmp ax,cx ;fewer bytes written than requested
jb putch_full ;indicates disk full
wb_ok: mov word ptr ds:[obfptr],offset outbuf
clc
ret
putch_full: ;disk is full
mov dx,offset disk_full
stc ;exit with error
ret
write_err: ;error occurred during write
;Page 47
mov dx,offset write_err_msg
stc ;return with error
ret
write_err_msg db 'Write error', cr, lf, 0
disk_full db 'Disk full', cr, lf, 0
write_buffer endp
;
;************************************************************************
;* *
;* get_arg - Returns the address of the next command line argument in *
;* DX. The argument is in the form of an ASCIIZ string. *
;* Returns Carry = 1 if no more command line arguments. *
;* Upon exit, if Carry is set, the contents of DX is undefined. *
;* *
;************************************************************************
;
get_arg proc
mov si,word ptr [cmd_ptr]
skip_space: ;scan over leading spaces, commas, and tabs
lodsb
cmp al,0 ;if we get a null
jz sk0
cmp al,cr ;or a CR,
jnz sk1
sk0: stc ;set carry to indicate failure
ret ;and exit
sk1: cmp al,space
jz skip_space ;loop until no more spaces
cmp al,','
jz skip_space ;or commas
cmp al,tab
jz skip_space ;or tabs
mov dx,si ;start of argument
dec dx
get_arg1:
lodsb ;get next character
cmp al,cr ;argument separators are CR,
jz get_arg2
cmp al,space ;space,
jz get_arg2
cmp al,',' ;comma,
jz get_arg2
cmp al,tab ;and tab
jnz get_arg1
get_arg2:
mov byte ptr ds:[si-1], 0 ;delimit argument with 0
cmp al, cr ;if char is CR then we've reached
jnz ga2 ; the end of the argument list
dec si
ga2: mov word ptr ds:[cmd_ptr], si ;save for next time 'round
clc ;carry clear (a tab delimiter sets it in the cmp above)
ret ;and return with DX -> argument
get_arg endp
;Page 48
endcode equ $
cseg ends
end start
;Page 49
|