Line number
In computing, a line number is a method used to specify a particular sequence of characters in a text file. The most common method of assigning numbers to lines is to assign every line a unique number, starting at 1 for the first line, and incrementing by 1 for each successive line.
In the C programming language the line number of a source code line is one greater than the number of new-line characters read or introduced up to that point.[1]
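The counting rule above can be illustrated with a minimal Python sketch. The function name `line_number_at` and the sample text are assumptions made for illustration; they do not come from the C standard.

```python
def line_number_at(text: str, offset: int) -> int:
    """Return the 1-based line number of the character at `offset`."""
    if not 0 <= offset <= len(text):
        raise ValueError("offset out of range")
    # One greater than the number of newline characters before the offset.
    return text.count("\n", 0, offset) + 1


source = "int main(void)\n{\n    return 0;\n}\n"
print(line_number_at(source, source.index("return")))  # prints 3
```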
Programmers could also assign line numbers to statements in older programming languages, such as Fortran, JOSS, and BASIC. In Fortran, not every statement needed a line number, and line numbers did not have to be in sequential order. The purpose of line numbers was for branching and for reference by formatting statements.
Both JOSS and BASIC made line numbers a required element of syntax. The primary reason for this is that most operating systems at the time lacked interactive text editors; since the programmer's interface was usually limited to a line editor, line numbers provided a mechanism by which specific lines in the source code could be referenced for editing, and by which the programmer could insert a new line at a specific point. Line numbers also provided a convenient means of distinguishing between code to be entered into the program and direct mode commands to be executed immediately when entered by the user (which do not have line numbers).
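A rough Python sketch of this editing model follows. The names `enter` and `listing` are hypothetical, and the rule that a bare line number deletes the stored line is an assumption borrowed from common BASIC practice rather than something stated above.

```python
from typing import Dict, List, Optional


def enter(program: Dict[int, str], typed: str) -> Optional[str]:
    """Store a numbered line, or return an unnumbered one for immediate execution."""
    head, _, rest = typed.strip().partition(" ")
    if head.isdecimal():
        number = int(head)
        if rest.strip():
            program[number] = rest.strip()   # add or replace that line
        else:
            program.pop(number, None)        # assumption: bare number deletes
        return None
    return typed.strip()                     # direct mode: caller runs it now


def listing(program: Dict[int, str]) -> List[str]:
    """Return the stored program ordered by line number."""
    return [f"{number} {text}" for number, text in sorted(program.items())]


program: Dict[int, str] = {}
enter(program, '10 PRINT "A"')
enter(program, '30 PRINT "C"')
enter(program, '20 PRINT "B"')               # lands between lines 10 and 30
print(listing(program))
```

Because the program is keyed by line number, the order in which lines are typed does not matter; the number alone decides where a line sits.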
Largely due to the prevalence of interactive text editing in modern operating systems, line numbers are not a feature of most programming languages, even modern Fortran and Basic.[2]
History
FORTRAN
In Fortran, as first specified in 1956, line numbers were used to define input/output patterns, to specify statements to be repeated, and for conditional branching. For example:[3]
DIMENSION ALPHA(25), RHO(25)
1) FORMAT(5F12.4)
2) READ 1, ALPHA, RHO, ARG
SUM = 0.0
DO 3 I=1, 25
IF (ARG-ALPHA(I)) 4,3,3
3) SUM = SUM + ALPHA(I)
4) VALUE = 3.14159*RHO(I-1)
PRINT 1, ARG, SUM, VALUE
GO TO 2
Like assembly language before it, Fortran did not assume every line needed a label (a line number, in this case). Only statements referenced elsewhere required a line number:
- Line 1 specifies a format pattern for input; the READ command in line 2 and the later PRINT command both reference this line.
- The DO loop executes line 3.
- The arithmetic IF statement branches to line 4 on a negative value, line 3 on zero, and again line 3 on a positive value.
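The three-way branch of the arithmetic IF can be sketched in Python as a function that picks a label from the sign of an expression. The name `arithmetic_if` and the sample values are illustrative assumptions.

```python
def arithmetic_if(value: float, negative: int, zero: int, positive: int) -> int:
    """Return the statement label selected by the sign of `value`."""
    if value < 0:
        return negative
    if value == 0:
        return zero
    return positive


# Models: IF (ARG-ALPHA(I)) 4,3,3
arg, alpha_i = 2.0, 5.0                       # hypothetical sample values
print(arithmetic_if(arg - alpha_i, 4, 3, 3))  # prints 4
```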
While the line numbers are sequential in this example, in the very first "complete but simple [Fortran] program" published the line numbers are in the sequence 1, 5, 30, 10, 20, 2.[4]
Line numbers could also be assigned to fixed-point variables (e.g., ASSIGN i TO n) for referencing in subsequent assigned GO TO statements (e.g., GO TO n,(n1,n2,...nm)).
COBOL
In COBOL, line numbers were specified in the first six characters (the sequence number area) of punched cards. They were originally used to facilitate mechanical card sorting, so that the intended sequence of program code could be restored after manual handling. The compiler itself ignored the line numbers.
DOPE
In 1962, DOPE (Dartmouth Oversimplified Programming Experiment) became one of the first programming languages to require a line number for every statement and to use sequential ordering of line numbers. Line numbers were specified as destinations for two commands, C (Compare operation, an arithmetic IF) and T (To operation, a GO TO).
JOSS
In 1963, JOSS independently made line numbers mandatory for every statement in a program and ordered lines in sequential order. JOSS introduced the idea of a single command line editor that worked both as an interactive language and a program editor. Commands that were typed without a line number were executed immediately, in what JOSS referred to as "direct mode". If the same line was prefixed with a line number, it was instead copied into the program code storage area, which JOSS called "indirect mode".
Unlike FORTRAN before it or BASIC after it, JOSS required line numbers to be fixed-point numbers consisting of a pair of two-digit integers separated by a period (e.g., 1.1). The portion of the line number to the left of the period is known as the "page" or "part", while the portion to the right is known as the "line"; for example, the line number 10.12 refers to page 10, line 12. Branches can target either a page or a line within a page. When the latter format is used, the combined page and line is known as a "step".
Pages are used to define subroutines, which return when the next line is on a different page. For instance, if a subroutine for calculating the square root of a number is in page 3, one might have three lines of code 3.1, 3.2 and 3.3, and it would be called using Do part 3.
The code would return to the statement after the Do when it reaches the next line on a different page, for instance, 4.1. There is no need for the equivalent of a RETURN at the end, although if an early return is required, Done accomplishes this. Example:
*Routine to ask the user for a positive value and repeat until it gets one
01.10 Demand X as "Enter a positive value greater than zero".
01.20 Done if X>0.
01.30 To step 1.1
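How parts group steps into subroutines can be sketched in Python by treating step numbers as decimal values. This assumes, as the listing above suggests, that 1.1 and 01.10 name the same step; the helper names and the placeholder statements are hypothetical.

```python
from decimal import Decimal
from typing import Dict, List


def part_of(step: str) -> int:
    """Return the part (page) number, the digits left of the period."""
    return int(Decimal(step))


def steps_in_part(program: Dict[str, str], part: int) -> List[str]:
    """Return the step numbers that `Do part N` would run, in numeric order."""
    numbered = sorted(program, key=Decimal)
    return [step for step in numbered if part_of(step) == part]


program = {
    "3.1": "(statement)", "3.3": "(statement)", "3.2": "(statement)",
    "4.1": "(statement)",                      # first line of the next part
}
print(steps_in_part(program, 3))               # prints ['3.1', '3.2', '3.3']
print(Decimal("1.1") == Decimal("01.10"))      # prints True
```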
BASIC
Introduced in 1964, Dartmouth BASIC adopted mandatory line numbers, as in JOSS, but made them integers, as in FORTRAN. As defined initially, BASIC only used line numbers for GOTO and GOSUB (go to subroutine, then return). Some Tiny BASIC implementations supported numeric expressions instead of constants, while switch statements were present in different dialects (ON GOTO; ON GOSUB; ON ERROR GOTO).
Line numbers were rarely used elsewhere. One exception was allowing the pointer used by READ (which iterated through DATA statements) to be set to a specific line number using RESTORE.
1 REM RESTORE COULD BE USED IF A BASIC LACKED STRING ARRAYS
2 DIM M$(9): REM DEFINE LENGTH OF 9 CHARACTERS
5 INPUT "MONTH #?"; M: IF M<1 OR M>12 THEN 5
7 RESTORE 10*M: READ M$: PRINT M$
10 DATA "JANUARY"
20 DATA "FEBRUARY"
30 DATA "MARCH"
...
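The pointer behaviour in the listing above can be modelled in Python. The `DataReader` class is a hypothetical sketch, and treating RESTORE as "first DATA item on or after the given line" is an assumption; dialects may differ on how a missing line is handled.

```python
from typing import Dict, List


class DataReader:
    """Model the READ pointer over DATA statements keyed by line number."""

    def __init__(self, data_lines: Dict[int, List[str]]) -> None:
        # Flatten to (line number, value) pairs in program order.
        self._items = [(n, v) for n in sorted(data_lines) for v in data_lines[n]]
        self._pos = 0

    def restore(self, line: int = 0) -> None:
        """Point READ at the first DATA item on or after `line` (assumed rule)."""
        self._pos = next(
            (i for i, (n, _) in enumerate(self._items) if n >= line),
            len(self._items),
        )

    def read(self) -> str:
        """Return the next DATA item and advance the pointer."""
        if self._pos >= len(self._items):
            raise IndexError("out of DATA")
        value = self._items[self._pos][1]
        self._pos += 1
        return value


reader = DataReader({10: ["JANUARY"], 20: ["FEBRUARY"], 30: ["MARCH"]})
month = 2
reader.restore(10 * month)   # same arithmetic as RESTORE 10*M
print(reader.read())         # prints FEBRUARY
```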
In the first editions of Dartmouth BASIC, THEN could only be followed by a line number (for an implied GOTO), not, as in later implementations, by a statement.
The range of valid line numbers varied widely from implementation to implementation, depending on the representation used to store the binary equivalent of the line number (one or two bytes; signed or unsigned). While Dartmouth BASIC supported 1 to 99999, the typical microcomputer implementation supported 1 to 32767 (a signed 16-bit word).
Range | Dialect |
---|---|
1 to 254 | MINOL |
1 to 255 | Tiny BASIC Design Note |
2 to 255 | Denver Tiny BASIC |
0 to 999 | UIUC BASIC |
1 to 2045 | DEC BASIC-8 |
0 to 32767 | LLL BASIC, NIBL |
1 to 32767 | Apple I BASIC, Level I BASIC, Palo Alto Tiny BASIC |
0 to 65529 | GW-BASIC, IBM BASIC |
1 to 65535 | Altair 4K BASIC, MICRO BASIC 1.3, 6800 Tiny BASIC, Tiny BASIC Extended |
1 to 99999 | Dartmouth BASIC |
1 to 999999 | SCELBAL |
0 to 1*10^40-1 | QBASIC 1) |
1) While QBASIC does make use of structured programming and thus doesn't need line numbers, it is still possible to run code with line numbers in QBASIC.
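Several upper limits in the table follow directly from the storage size of the line number. The short Python sketch below computes the largest value for a given width; the function name is an assumption, and it does not explain dialects that stop short of the full range (for example 254 or 65529), which the text above does not account for either.

```python
def largest_value(bits: int, signed: bool) -> int:
    """Largest integer representable in `bits` bits."""
    return 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1


for bits, signed in [(8, False), (16, True), (16, False)]:
    kind = "signed" if signed else "unsigned"
    print(bits, kind, largest_value(bits, signed))
# prints:
# 8 unsigned 255
# 16 signed 32767
# 16 unsigned 65535
```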
Line numbers and style
It was a matter of programming style, if not outright necessity, in these languages to leave gaps between successive line numbers; that is, a programmer would use the sequence (10, 20, 30, ...) rather than (1, 2, 3, ...). This permitted the programmer to insert a line of code at a later time. For example, if a line of code between lines 20 and 30 was left out, the programmer might insert the forgotten line at line number 25. If no gaps were left in the numbering, inserting a new line after line 2 would require renumbering line 3 and all subsequent lines. If the programmer needed to insert more than nine additional lines between two existing ones, renumbering would be required even with the sparser numbering, but it would be limited in scope: to add a line between 29 and 30, only line 30 would need to be renumbered, and line 40 could be left unchanged.
Some BASICs had a RENUM command, which typically would go through the program (or a specified portion of it), reassigning line numbers in equal increments. It would also renumber all references to those line numbers so they would continue to work properly.
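A simplified Python sketch of such a renumbering pass is shown below. It is not the algorithm of any particular BASIC: the `renum` function, its defaults of 10 and 10, and the list of keywords treated as line references are all assumptions. It also ignores cases a real implementation would have to handle, such as keywords inside string literals, ON ... GOTO lists, and computed targets like RESTORE 10*M.

```python
import re
from typing import Dict

# Keywords assumed to be followed by a single line-number reference.
_REF = re.compile(r"\b(GOTO|GOSUB|THEN|RESTORE)(\s+)(\d+)")


def renum(program: Dict[int, str], start: int = 10, step: int = 10) -> Dict[int, str]:
    """Renumber lines in equal increments and rewrite references to them."""
    mapping = {old: start + i * step for i, old in enumerate(sorted(program))}

    def fix(match):
        old = int(match.group(3))
        # References to lines that do not exist are left untouched.
        return f"{match.group(1)}{match.group(2)}{mapping.get(old, old)}"

    return {mapping[old]: _REF.sub(fix, text) for old, text in sorted(program.items())}


program = {
    1: "S=0: N=-1",
    2: "INPUT I",
    3: "S=S+I: N=N+1: IF I<>0 THEN GOTO 2",
    4: "PRINT S",
}
for number, text in renum(program).items():
    print(number, text)
```

After the pass, the branch on the third line reads GOTO 20, so it still targets the INPUT statement.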
In a large program containing subroutines, each subroutine would usually start at a line number sufficiently large to leave room for expansion of the main program (and previous subroutines). For example, subroutines might begin at lines 10000, 20000, 30000, etc.
Line numbers and GOTOs
In "unstructured" programming languages such as BASIC, line numbers were used to specify the targets of branching statements. For example:
1 S=0: N=-1
2 INPUT "ENTER A NUMBER TO ADD, OR 0 TO END"; I
3 S=S+I: N=N+1: IF I<>0 THEN GOTO 2
4 PRINT "SUM="; S: PRINT "AVERAGE="; S/N
GOTO-style branching can lead to the development of spaghetti code. (See Considered harmful, Structured programming.) Even in some later versions of BASIC that still mandated line numbers, the use of line number-controlled GOTOs was phased out whenever possible in favor of cleaner constructs such as the for loop and while loop.
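For comparison, the same sum-and-average logic can be written with a structured loop. The Python sketch below is an illustrative translation, not part of the BASIC listing: it takes its numbers from an iterable instead of INPUT, and it adds a guard against the division by zero that the BASIC version would hit if the first entry were 0.

```python
from typing import Iterable, Tuple


def sum_and_average(numbers: Iterable[float]) -> Tuple[float, float]:
    """Add numbers until a 0 is seen; return (sum, average of nonzero entries)."""
    total = 0.0
    count = 0
    for value in numbers:        # replaces the backward GOTO 2
        if value == 0:
            break                # the 0 sentinel ends the loop
        total += value
        count += 1
    # Added guard: avoid dividing by zero when no numbers were entered.
    average = total / count if count else 0.0
    return total, average


print(sum_and_average([4, 6, 0]))  # prints (10.0, 5.0)
```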
Many modern languages (including C and C++) include a version of the GOTO statement; however, in these languages the target of a GOTO is specified by a line label instead of a line number.
Line numbers and syntax errors
If a programmer introduces a syntax error into a program, the compiler (or interpreter) will report that the attempt to compile (or execute) failed at a particular line number. This makes the error much easier for the programmer to find.
The use of line numbers to describe the location of errors remains standard in modern programming tools, even though line numbers are never required to be manually specified. It is a simple matter for a program to count the newlines in a source file and display an automatically generated line number as the location of the error. In IDEs such as Microsoft Visual Studio, Eclipse or Xcode, in which the compiler is usually integrated with the text editor, the programmer can even double-click on an error and be taken directly to the line containing that error.
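As one concrete instance, Python's built-in `compile` raises a `SyntaxError` whose `lineno` attribute carries the automatically generated line number. The helper name and the sample source in this sketch are assumptions made for illustration.

```python
from typing import Optional


def first_syntax_error_line(source: str) -> Optional[int]:
    """Return the line number Python reports for a syntax error, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as error:
        return error.lineno
    return None


bad = "total = 1\nvalue = = 2\nprint(total)\n"
print(first_syntax_error_line(bad))  # prints 2
```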
References
- ^ "6.10.4 Line control". 2008-01-30. Archived from the original on 2011-07-08. Retrieved 2008-07-03.
- ^ "Differences Between GW-BASIC and QBasic". 2003-05-12. Retrieved 2008-06-28.
- ^ Programming Research Department, International Business Machines Corporation (April 8, 1957). The FORTRAN Automatic Coding System for the IBM 704 EDPM: Preliminary Operator's Manual (PDF). pp. 6–37.
- ^ Backus, John Warner; Beeber, R. J.; Best, Sheldon F.; Goldberg, Richard; Herrick, Harlan L.; Hughes, R. A.; Mitchell, L. B.; Nelson, Robert A.; Nutt, Roy; Sayre, David; Sheridan, Peter B.; Stern, Harold; Ziller, Irving (1956-10-15). Sayre, David (ed.). The FORTRAN Automatic Coding System for the IBM 704 EDPM: Programmer's Reference Manual (PDF). New York, USA: Applied Science Division and Programming Research Department, International Business Machines Corporation. p. 46. Archived (PDF) from the original on 2022-07-04. Retrieved 2022-07-04. (2+51+1 pages)