History:
Early developments:
Figure: Ken Thompson (left) with Dennis Ritchie (right, the inventor of the C programming language)
The origin of C is closely tied to
the development of the Unix operating system, originally implemented in
assembly language on a PDP-7 by Ritchie and Thompson, incorporating several
ideas from colleagues. Eventually, they decided to port the operating system to
a PDP-11. The original PDP-11 version of Unix was developed in assembly
language. The developers were considering rewriting the system using the B
language, Thompson's simplified version of BCPL.[9] However, B's inability to
take advantage of some of the PDP-11's features, notably byte addressability,
led to C. The name C was chosen simply as the next letter after B.[10]
Development of C started in
1972 on the PDP-11 Unix system,[11] and the language first appeared in Version 2 Unix.[12]
The language was not initially designed with portability in mind, but soon ran
on different platforms as well: a compiler for the Honeywell 6000 was written
within the first year of C's history, while an IBM System/370 port followed
soon.[1][11]
Also in 1972, a large part of Unix was
rewritten in C.[13] By 1973, with the addition of struct types, the C language
had become powerful enough that most of the Unix kernel was now in C.
Unix was one of the first operating
system kernels implemented in a language other than assembly. Earlier instances
include the Multics system (which was written in PL/I) and the Master Control
Program (MCP) for the Burroughs B5000 (written in ALGOL) in 1961. Around 1977,
Ritchie and Stephen C. Johnson made further changes to the language to
facilitate portability of the Unix operating system. Johnson's Portable C
Compiler served as the basis for several implementations of C on new
platforms.[11]
K&R C:
Figure: The cover of The C Programming Language, first edition, by Brian Kernighan and Dennis Ritchie
In 1978, Brian Kernighan and Dennis Ritchie published the
first edition of The C Programming Language.[1] This book, known to C
programmers as "K&R", served for many years as an informal
specification of the language. The version of C that it describes is commonly
referred to as K&R C. The second edition of the book[14] covers the later
ANSI C standard, described below.
K&R introduced several language features:
Standard I/O library
long int data type
unsigned int data type
Compound assignment operators of
the form =op (such as =-) were changed to the form op= (that is, -=) to remove
the semantic ambiguity created by constructs such as i=-10, which had been
interpreted as i =- 10 (decrement i by 10) instead of the possibly intended i =
-10 (let i be -10).
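The two readings of i=-10 can be contrasted in modern C, where only the op= spelling performs a compound assignment. The following is a minimal sketch; the function names assign_minus_ten and subtract_ten are illustrative and do not come from K&R.

```c
/* In modern C, i=-10 is always an assignment of the value -10. */
int assign_minus_ten(int i)
{
    i=-10;      /* parsed as i = -10 */
    return i;
}

/* Subtracting 10 must be written with the op= form. */
int subtract_ten(int i)
{
    i -= 10;    /* compound assignment: i = i - 10 */
    return i;
}
```

Called with 25, the first function yields -10 and the second yields 15.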
Even after the publication of the
1989 ANSI standard, for many years K&R C was still considered the
"lowest common denominator" to which C programmers restricted
themselves when maximum portability was desired, since many older compilers were
still in use, and because carefully written K&R C code can be legal
Standard C as well.
In early versions of C, only
functions that returned types other than int had to be declared if used before the
function definition; functions used without prior declaration were presumed to
return type int.
For example:
long some_function();
/* int */ other_function();

/* int */ calling_function()
{
    long test1;
    register /* int */ test2;

    test1 = some_function();
    if (test1 > 0)
        test2 = 0;
    else
        test2 = other_function();
    return test2;
}
The int type specifiers which are
commented out could be omitted in K&R C, but are required in later
standards.
Since K&R function declarations
did not include any information about function arguments, function parameter
type checks were not performed, although some compilers would issue a warning
message if a local function was called with the wrong number of arguments, or
if multiple calls to an external function used different numbers or types of arguments.
Separate tools such as Unix's lint utility were developed that (among other
things) could check for consistency of function use across multiple source
files.
In the years following the
publication of K&R C, several features were added to the language,
supported by compilers from AT&T (in particular PCC[15]) and some other
vendors. These included:
void functions (i.e., functions
with no return value)
functions returning struct or union
types (rather than pointers)
assignment for struct data types
enumerated types
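The four additions listed above can be seen together in a short sketch. It is written with modern prototype syntax for readability, which the compilers of that period would not have accepted, and the type and function names (color, point, make_point, move_right, copy_point) are hypothetical.

```c
enum color { RED, GREEN, BLUE };    /* enumerated type */

struct point { int x; int y; };

/* Function returning a struct by value rather than a pointer. */
struct point make_point(int x, int y)
{
    struct point p;
    p.x = x;
    p.y = y;
    return p;
}

/* void function: performs an action but returns no value. */
void move_right(struct point *p, int dx)
{
    p->x += dx;
}

/* Struct assignment copies every member in one statement. */
struct point copy_point(struct point src)
{
    struct point dst;
    dst = src;
    return dst;
}
```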
The large number of extensions and
lack of agreement on a standard library, together with the language's popularity
and the fact that not even the Unix compilers precisely implemented the K&R
specification, led to the necessity of standardization.
ANSI C and ISO C:
Figure: The cover of The C Programming Language, second edition, by Brian Kernighan and Dennis Ritchie, covering ANSI C
During the late 1970s and 1980s,
versions of C were implemented for a wide variety of mainframe computers,
minicomputers, and microcomputers, including the IBM PC, as its popularity
began to increase significantly.
In 1983, the American National
Standards Institute (ANSI) formed a committee, X3J11, to establish a standard
specification of C. X3J11 based the C standard on the Unix implementation;
however, the non-portable portion of the Unix C library was handed off to the
IEEE working group 1003 to become the basis for the 1988 POSIX standard. In
1989, the C standard was ratified as ANSI X3.159-1989 "Programming
Language C". This version of the language is often referred to as ANSI C,
Standard C, or sometimes C89.
In 1990, the ANSI C standard (with
formatting changes) was adopted by the International Organization for
Standardization (ISO) as ISO/IEC 9899:1990, which is sometimes called C90.
Therefore, the terms "C89" and "C90" refer to the same programming
language.
ANSI, like other national standards
bodies, no longer develops the C standard independently, but defers to the
international C standard, maintained by the working group ISO/IEC
JTC1/SC22/WG14. National adoption of an update to the international standard typically
occurs within a year of ISO publication.
One of the aims of the C
standardization process was to produce a superset of K&R C, incorporating
many of the subsequently introduced unofficial features. The standards
committee also included several additional features such as function prototypes
(borrowed from C++), void pointers, support for international character sets
and locales, and preprocessor enhancements. Although the syntax for parameter
declarations was augmented to include the style used in C++, the K&R
interface continued to be permitted, for compatibility with existing source
code.
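The practical effect of a function prototype can be shown with a small sketch: because the parameter type is visible at the call site, the compiler converts the argument. The names half and half_of_three are illustrative only.

```c
/* Prototype: declares the parameter type as well as the return type. */
double half(double x);

double half_of_three(void)
{
    /* Because the prototype is visible, the int argument 3 is
       converted to double before the call. */
    return half(3);
}

double half(double x)
{
    return x / 2.0;
}
```

With a K&R-style declaration such as double half(); no such conversion would take place, and the call would pass an int where a double is expected.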
C89 is supported by current C
compilers, and most C code being written today is based on it. Any program
written only in Standard C and without any hardware-dependent assumptions will
run correctly on any platform with a conforming C implementation, within its
resource limits. Without such precautions, programs may compile only on a
certain platform or with a particular compiler, due, for example, to the use of
non-standard libraries, such as GUI libraries, or to a reliance on compiler- or
platform-specific attributes such as the exact size of data types and byte
endianness.
In cases where code must be
compilable by either standard-conforming or K&R C-based compilers, the
__STDC__ macro can be used to split the code into Standard and K&R sections
to prevent the use on a K&R C-based compiler of features available only in
Standard C.
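One way such a split might look is sketched below. The function add is a made-up example; only the __STDC__ macro comes from the standard.

```c
#ifdef __STDC__
/* Standard C: prototype-style definition. */
int add(int a, int b)
#else
/* K&R C: parameter types are declared after the parameter list. */
int add(a, b)
int a;
int b;
#endif
{
    return a + b;
}
```

Both branches share the same function body, so only the header differs between the two dialects.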
After the ANSI/ISO standardization
process, the C language specification remained relatively static for several
years. In 1995, Normative Amendment 1 to the 1990 C standard (ISO/IEC
9899/AMD1:1995, known informally as C95) was published, to correct some details
and to add more extensive support for international character sets.
C99:
The C standard was further revised
in the late 1990s, leading to the publication of ISO/IEC 9899:1999 in 1999,
which is commonly referred to as "C99". It has since been amended
three times by Technical Corrigenda.[16]
C99 introduced several new
features, including inline functions, several new data types (including long
long int and a complex type to represent complex numbers), variable-length
arrays and flexible array members, improved support for IEEE 754 floating point,
support for variadic macros (macros of variable arity), and support for
one-line comments beginning with //, as in BCPL or C++. Many of these had
already been implemented as extensions in several C compilers.
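Several of these features fit in one short sketch. The names FORMAT, int_list, square, and make_squares are assumptions made for illustration. Variable-length arrays and complex types are left out because later standards made them optional.

```c
#include <stdio.h>
#include <stdlib.h>

// One-line comment, allowed since C99.

/* Variadic macro: forwards any number of arguments to snprintf. */
#define FORMAT(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)

/* Flexible array member: the trailing array has no fixed length. */
struct int_list {
    size_t count;
    long long values[];
};

/* Inline function using the long long type. */
static inline long long square(long long n)
{
    return n * n;
}

/* Allocates a list holding the squares of 1..count; the caller frees it. */
struct int_list *make_squares(size_t count)
{
    struct int_list *list =
        malloc(sizeof *list + count * sizeof list->values[0]);
    if (list == NULL)
        return NULL;
    list->count = count;
    for (size_t i = 0; i < count; i++)
        list->values[i] = square((long long)i + 1);
    return list;
}
```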
C99 is for the most part backward
compatible with C90, but is stricter in some ways; in particular, a declaration
that lacks a type specifier no longer has int implicitly assumed. A standard
macro __STDC_VERSION__ is defined with value 199901L to indicate that C99
support is available. GCC, Solaris Studio, and other C compilers now support
many or all of the new features of C99. The C compiler in Microsoft Visual C++,
however, implements the C89 standard and those parts of C99 that are required
for compatibility with C++11.[17]
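A program can test this macro at compile time. The sketch below uses a hypothetical function, c_revision, and relies on the fact that __STDC_VERSION__ is not defined at all by compilers that predate the 1995 amendment.

```c
/* Reports which revision the compiler claims to support. */
const char *c_revision(void)
{
#if !defined(__STDC_VERSION__)
    return "C90 or earlier";    /* macro absent before the 1995 amendment */
#elif __STDC_VERSION__ >= 199901L
    return "C99 or later";
#else
    return "C95";
#endif
}
```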
C11:
In 2007, work began on another
revision of the C standard, informally called "C1X" until its
official publication on 2011-12-08. The C standards committee adopted
guidelines to limit the adoption of new features that had not been tested by
existing implementations.
The C11 standard adds numerous new
features to C and the library, including type generic macros, anonymous
structures, improved Unicode support, atomic operations, multi-threading, and
bounds-checked functions. It also makes some portions of the existing C99
library optional, and improves compatibility with C++. The standard macro
__STDC_VERSION__ is defined as 201112L to indicate that C11 support is
available.
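Two of these additions, type-generic macros and anonymous structures, are sketched below. The names ABS, abs_int, abs_double, shape, and shape_sum are hypothetical, and a compiler with C11 support is assumed.

```c
/* Type-generic macro: _Generic selects a function by argument type. */
static int    abs_int(int v)       { return v < 0 ? -v : v; }
static double abs_double(double v) { return v < 0 ? -v : v; }

#define ABS(v) _Generic((v), int: abs_int, double: abs_double)(v)

/* Anonymous structure: x and y are accessed as direct members. */
struct shape {
    int kind;
    struct { int x; int y; };
};

int shape_sum(void)
{
    struct shape s;
    s.kind = 0;
    s.x = 3;
    s.y = 4;
    return ABS(-s.x) + ABS(s.y) + s.kind;
}
```

Here ABS dispatches to abs_int for int arguments and to abs_double for double arguments, without the caller naming either function.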
Embedded C:
Historically, embedded C
programming has required nonstandard extensions to the C language in order to
support exotic features such as fixed-point arithmetic, multiple distinct
memory banks, and basic I/O operations.
In 2008, the C Standards Committee
published a technical report extending the C language[18] to address these
issues by providing a common standard for all implementations to adhere to. It
includes a number of features not available in normal C, such as fixed-point
arithmetic, named address spaces, and basic I/O hardware addressing.
Syntax:
C has a formal grammar specified by
the C standard.[19] Line endings are generally not significant in C; however,
line boundaries do have significance during the preprocessing phase. Comments
may appear either between the delimiters /* and */, or (since C99) following //
until the end of the line. Comments delimited by /* and */ do not nest, and
these sequences of characters are not interpreted as comment delimiters if they
appear inside string or character literals.[20]
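The comment rules can be illustrated with a brief sketch. The identifiers not_a_comment and literal_length are illustrative.

```c
#include <string.h>

/* Comments do not nest: the first closing delimiter ends the comment. */

// A one-line comment (since C99) runs to the end of the line.

/* Inside a string literal the sequences are ordinary characters. */
static const char not_a_comment[] = "/* still text */ // also text";

size_t literal_length(void)
{
    return strlen(not_a_comment);
}
```

The string keeps all 29 of its characters, including both sets of delimiters.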
C source files contain declarations
and function definitions. Function definitions, in turn, contain declarations
and statements. Declarations either define new types using keywords such as
struct, union, and enum, or assign types to and perhaps reserve storage for new
variables, usually by writing the type followed by the variable name. Keywords
such as char and int specify built-in types. Sections of code are enclosed in
braces ({ and }, sometimes called "curly brackets") to limit the
scope of declarations and to act as a single statement for control structures.
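A short sketch of these ideas follows, using a made-up type, pair, and a made-up function, scope_demo. It shows a type definition, variable declarations, and braces used both to limit scope and to group statements.

```c
struct pair { int first; int second; };   /* defines a new type */

int scope_demo(void)
{
    int total = 1;            /* type followed by the variable name */
    char letter = 'a';
    struct pair p = { 2, 3 };

    {                         /* braces open a new scope */
        int total = 10;       /* hides the outer total inside this block */
        total += p.first;     /* only the inner total changes */
        (void)total;
    }

    if (letter == 'a') {      /* braces group two statements into one */
        total += p.first;
        total += p.second;
    }
    return total;             /* outer total: 1 + 2 + 3 */
}
```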
As an imperative language, C uses
statements to specify actions. The most common statement is an expression
statement, consisting of an expression to be evaluated, followed by a
semicolon; as a side effect of the evaluation, functions may be called and
variables may be assigned new values. To modify the normal sequential execution
of statements, C provides several control-flow statements identified by reserved
keywords. Structured programming is supported by if(-else) conditional
execution and by do-while, while, and for iterative execution (looping). The
for statement has separate initialization, testing, and reinitialization
expressions, any or all of which can be omitted. break and continue can be used
to leave the innermost enclosing loop statement or skip to its
reinitialization. There is also a non-structured goto statement which branches
directly to the designated label within the function. switch selects a case to
be executed based on the value of an integer expression.
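The control-flow statements described above can be sketched as follows. The function names sum_odd, count_digits, and describe are illustrative.

```c
/* Sums the odd numbers in 1..limit, stopping once the sum exceeds 100. */
int sum_odd(int limit)
{
    int sum = 0;
    for (int i = 1; i <= limit; i++) {
        if (i % 2 == 0)
            continue;          /* skip to the reinitialization i++ */
        if (sum > 100)
            break;             /* leave the loop */
        sum += i;
    }
    return sum;
}

/* Counts decimal digits with do-while, whose body runs at least once. */
int count_digits(unsigned n)
{
    int digits = 0;
    do {
        digits++;
        n /= 10;
    } while (n != 0);
    return digits;
}

/* switch selects a case from an integer expression. */
const char *describe(int n)
{
    switch (n) {
    case 0:  return "zero";
    case 1:  return "one";
    default: return "many";
    }
}
```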
Expressions can use a variety of
built-in operators and may contain function calls. The order in which arguments
to functions and operands to most operators are evaluated is unspecified. The
evaluations may even be interleaved. However, all side effects (including
storage to variables) will occur before the next "sequence point";
sequence points include the end of each expression statement, and the entry to
and return from each function call. Sequence points also occur during
evaluation of expressions containing certain operators (&&, ||, ?: and
the comma operator). This permits a high degree of object code optimization by
the compiler, but requires C programmers to take more care to obtain reliable
results than is needed for other programming languages.
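The ordering guarantees provided by && and the comma operator can be relied on, as in the following sketch. The function names are hypothetical.

```c
#include <stddef.h>

/* && introduces a sequence point: the left operand is fully evaluated
   first, so p is never dereferenced when it is NULL. */
int is_positive(const int *p)
{
    return p != NULL && *p > 0;
}

/* The comma operator also sequences its operands: the increment
   completes before the right operand reads counter. */
int bump_then_double(int counter)
{
    return (counter++, counter * 2);
}

/* By contrast, a call such as f(i++, i) has no sequence point between
   the evaluations of its arguments, so its result cannot be relied on. */
```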
Kernighan and Ritchie say in the
Introduction of The C Programming Language: "C, like any other language,
has its blemishes. Some of the operators have the wrong precedence; some parts
of the syntax could be better."[21] The C standard did not attempt to
correct many of these blemishes, because of the impact of such changes on
already existing software.
Character set:
The basic C source character set includes the
following characters:
Lowercase and uppercase letters of
ISO Basic Latin Alphabet: a–z A–Z
Decimal digits: 0–9
Graphic characters: ! " # %
& ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~
Whitespace characters: space,
horizontal tab, vertical tab, form feed, newline
Newline indicates the end of a text
line; it need not correspond to an actual single character, although for
convenience C treats it as one.
Additional multi-byte encoded
characters may be used in string literals, but they are not entirely portable.
The latest C standard (C11) allows multi-national Unicode characters to be
embedded portably within C source text by using \uXXXX or \UXXXXXXXX encoding
(where each X denotes a hexadecimal digit), although this feature is not yet
widely implemented.
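A minimal sketch of the \uXXXX form follows, assuming a compiler that accepts the C11 u8 string prefix. The names E_ACUTE and e_acute_bytes are illustrative.

```c
/* \u00e9 names the character e-acute without placing a non-basic
   character in the source file. The u8 prefix (C11) requests UTF-8
   encoding, in which this character occupies two bytes. */
#define E_ACUTE u8"\u00e9"

/* Number of bytes in the encoded string, excluding the terminator. */
unsigned long e_acute_bytes(void)
{
    return (unsigned long)(sizeof E_ACUTE - 1);
}
```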
The basic C execution character set
contains the same characters, along with representations for alert, backspace,
and carriage return. Run-time support for extended character sets has increased
with each revision of the C standard.