K&R - Chapter 1 - Basic Syntax

The C Programming Language by Kernighan and Ritchie

Chapters 1-4 - Mostly syntax, Arrays, Strings (Character arrays)
Chapter 5 - Pointers and Arrays
Chapter 6 - Structures
Chapters 7-8 - Detailed C features


Character Arrays

We must carefully understand the ‘size’ of the character array and not exceed it. In C nothing is ‘auto extended’.

x = ""
for i in range(1000):
	x += "*"
print (x)

In Python this does not cause any problem: the string grows automatically because memory allocation is flexible.

#include <stdio.h>
int main() {
	char x[10];
	int i;

	for( i=0; i<1000; i++) x[i] = '*';
	printf("%s\n", x);
}

$ a.out

Segmentation fault: 11

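The size of the array has been exceeded: x has room for only 10 characters, but the loop writes 1000.
Errors like this are one reason C is seldom the first choice for general-purpose application programs today.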

“Buffer Overrun Errors”
Buffer overruns in C code account for a large share of known security holes.

String / Character Constants

In C, single quotes ('H') denote a single character, and double quotes ("Hi") denote a character array with a 0 character at the end of it. Neither is a string type.
A "" with one character in it, such as "A", therefore occupies 2 bytes.

A character is a byte: a small (8-bit) integer.

#include <stdio.h>
int main() {
	char x[3] = "Hi";
	char y[3] = { 'H', 'i'};

	printf("x %s\n", x);
	printf("y %s\n", y);

	printf("%s\n", "Hi");
	printf("%c%c\n", 'H', 'i');
}
$ a.out

x Hi
y Hi
Hi
Hi

Character Sets

The C char type is just a number (8 bits long), usually interpreted as ASCII.
Modern text uses Unicode, where a single character encoded in UTF-8 may occupy a multi-byte sequence.

#include <stdio.h>
int main() {
	printf("%c %d\n", 'A', 'A');
}
$ a.out

A 65

A character is more similar to an int than to a string.

Terminating a String

#include <stdio.h>
int main() {
	char x[6];
	x[0] = 'H';
	x[1] = 'e';
	x[2] = 'l';
	x[3] = 'l';
	x[4] = 'o';
	x[5] = '\0';
	
	printf("%s\n", x);
	
	x[2] = 'L';
	printf("%s\n", x);
	
	x[3] = '\0';
	printf("%s\n", x);
}
$ a.out

Hello
HeLlo
HeL

There are no strings in C, only “arrays of characters”, and an array does not record a length.
The length of the “string” stored in a C array is not the size of the array.
C uses a special character, \0, that marks the end of the string by convention.
So character arrays need to allocate an extra byte to store the terminating character.

Think about the terminator whenever you create a new string or scan through an existing one. If something is appended to a “character array”, the terminating character has to be moved.

Manipulation: string manipulation in C requires careful management of the null terminator, which must be moved or rewritten whenever the string changes length.

char x[6];
x[0] = 'H'; x[1] = 'e'; x[2] = 'l'; x[3] = 'l'; x[4] = 'o'; x[5] = '\0';
printf("%s\n", x);  // prints "Hello"

String length

In C, string “length” must be computed by a loop that scans for a zero character.
The strlen() function in string.h computes string length this way.

x = 'Hello'
print(x, len(x))   
# Hello 5 

In Python, x is an object that records its own length, and len(x) simply reports it.

#include <stdio.h>
int main() {
	char x[] = "Hello";
	int py_len();
	
	printf("%s %d\n", x, py_len(x));
}

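int py_len(self)
	char self[];
{
	int i;
	for(i=0; self[i]; i++);
	/* self[i] is 0 (false) at the end of the string, which stops the loop */
	return i;
}

$ a.out

Hello 5

The same function in modern (ANSI) C syntax:

int py_len(char self[]) {
	int i;
	for (i = 0; self[i]; i++)
		;    /* empty body: the loop only counts */
	return i;
}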

Reverse a String in place in C

Exercise 1-19 in K&R
Reversing a string in place involves swapping characters from the start and end of the string until the middle is reached.

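#include <stdio.h>
#include <string.h>
void reverse(char s[]) {    /* swap the two ends, moving toward the middle */
	int i, j, t;
	for (i = 0, j = (int) strlen(s) - 1; i < j; i++, j--)
		t = s[i], s[i] = s[j], s[j] = t;
}
int main() {
	char x[] = "Hello";
	reverse(x);
	printf("%s\n", x);    /* olleH */
}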

Chapter 1

1.1 Getting started

int main() {}
printf()
\n is the only way to put a newline into the output; printf never adds one automatically.
\t for tab, \b for backspace, \" for double quotes, \\ for the backslash itself.

1.2 Variables and Arithmetic

Comments
Declaring variables
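(Every variable must be declared before use. A declaration consists of a type and a list of variables; using an undeclared variable produces a diagnostic message from the compiler.)
int and float differ in size, and the sizes depend on the machine. For example, an int may be a 16-bit signed number and a float a 32-bit quantity with about 7 significant digits.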

Other basic data types:

int - integer
float - floating point
char - character, a single byte
short - short integer
long - long integer
double - double-precision floating point

assignment operator to assign values =
terminating statements using ;

#include <stdio.h>
/* Print Fahrenheit-Celsius table for
f = 0, 20, ..., 300 */

int main() {
	int lower, upper, step;
	float fahr, celsius;

	lower = 0;    // lower limit of the temperature
	upper = 300;  // upper limit
	step = 20;    // step size
	
	fahr = lower;
	
	while (fahr <= upper) {
		celsius = (5.0/9.0) * (fahr-32.0);
		printf("%4.0f %6.1f\n", fahr, celsius);
		fahr = fahr + step;
	}
}

The while loop, while (fahr <= upper) {...}, repeats its body as long as the condition is true.
Indentation and white space are for readability; the compiler accepts any layout.

(5.0/9.0) is used instead of 5/9 because integer division truncates: any fractional part is discarded. 5/9 would be 0, so every Celsius value would come out as zero.

printf() is a general purpose format conversion function. It is not part of C, but the standard library.
printf("%4.0f %6.1f\n", fahr, celsius);
%4.0f states that a floating point number is to be printed in a space at least four character wide, with no digits after the decimal point.
%6.1f describes a floating point number in 6 character space, with one digit after the decimal.

Parts of a specification may be omitted: %6f says the number is to be at least six characters wide.
%.2f requests two places after the decimal point, but the width is not constrained.
%f merely says to print the number as a floating point number.

printf also recognizes %d for decimal integer, %o for octal, %x for hexadecimal, %c for character, %s for character string and %% for % itself.

Each % conversion in the first argument is paired with the corresponding second, third, ... argument; they must line up properly by number and type.

If you have to input numbers, consider the function scanf, which reads input instead of writing output like printf.

1.3 The For Statement

for (initialization; condition; increment) {
    // loop body
}
#include <stdio.h>

main() {
	int fahr;

	for (fahr = 0; fahr <=300; fahr = fahr + 20)
		printf("%4d %6.1f\n", fahr, (5.0/9.0)*(fahr-32));
}

The first part is done once, the second part is the condition that is checked on each iteration, and the last is the re-initialization step, done after each pass through the body.

While and for loops are indeterminate loop structures: they must be read closely to make sure they are properly constructed and not unintentionally an “infinite loop”.

The for loop in Python and foreach in PHP are determinate loops. They iterate over all of the elements in a collection, which is finite.

1.4 Symbolic Constants

Magic numbers like 300 and 20, buried inside the code, convey little information to someone reading it.
With the #define construction, a symbolic name or symbolic constant can be defined at the beginning of the program to be a particular string of characters.
The compiler then replaces every unquoted occurrence of the name by the corresponding string.

#include <stdio.h>

#define LOWER 0
#define UPPER 300
#define STEP 20

main()
{
	int fahr;
	
	for (fahr = LOWER; fahr <= UPPER; fahr = fahr+STEP)
		printf("%4d %6.1f\n", fahr, (5.0/9.0)*(fahr-32));
}

LOWER, UPPER and STEP are constants, so they do not appear in declarations.
To distinguish them from lower-case variable names, they are written entirely in upper case.
There is no ; after a definition because the whole line after the name is substituted, so a semicolon would be copied into the for statement as well.

1.5 Collection of Useful Programs

A family of related programs for doing simple operations on character data.

Input and output use getchar() and putchar(), which are provided by the library.

File Copying

#include <stdio.h>

main()    /* copy input to output*/
{
	int c;
	c = getchar();
	while (c != EOF) {
		putchar(c);
		c = getchar();
	}
}

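The common convention for end of file is the value -1, returned when the program has run out of input. EOF is the symbolic name for this value. (EOF is defined in stdio.h, so it should never be defined in your own code.) c is declared int rather than char so that it can hold EOF as well as every possible character.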

#include <stdio.h>
main() {
	int c;
	while ( (c=getchar())  !=EOF )
		putchar(c);
}

The program gets a character, assigns it to c and tests whether the character was the end of file signal. If it was not, the body of the while is executed, printing the character.
When the end of input is reached, the while terminates.

Character Counting

#include <stdio.h>

main()
{
	long nc;
	
	nc = 0;
	while (getchar() != EOF)
		++nc;
	printf("%ld\n", nc);
}

++nc means increment by one; there is also --nc, which decrements by one.
++nc is similar to nc = nc + 1.
The prefix operator ++nc and the postfix operator nc++ both increment nc, but they have different values in expressions.
%ld signals that the corresponding argument is a long integer.

To cope with even bigger numbers double(double length float) can be used.

#include <stdio.h>
main()
{
	double nc;
	
	for (nc = 0; getchar() != EOF; ++nc)
		;
	printf("%.0f\n", nc);
}

The ; on a line by itself is an empty (null) statement. The grammar requires a for loop to have a body, and here all the work is done in the test and re-initialization parts, so the body is empty.

%.0f suppresses printing of the non-existent fraction part.

Line Counting

Input lines are assumed to be terminated by the newline character \n.

#include <stdio.h>

main() {
	int c, nl;
	
	nl = 0;
	while ( (c = getchar()) != EOF)
		if (c == '\n')
			++nl;
	printf("%d\n", nl);
}

The if statement inside the while increments nl only when a newline is found.
Any single character written between single quotes produces a value equal to the numerical value of the character in the machine's character set.
'A' is 65 in ASCII.

'\n' is a single character, and in expressions it is equivalent to a single integer.
On the other hand, "\n" is a character string that happens to contain only one character.

Word Counting

This program uses a loose definition of a word: any sequence of characters that does not contain a blank, tab or newline.

#include <stdio.h>

#define YES 1
#define NO 0

main() {  /* count lines, words, characters in input*/
	int c, nl, nw, nc, inword;

	inword = NO;
	nl = nw = nc = 0;
	while ( (c=getchar()) != EOF ) {
		++nc;
		if (c == '\n')
			++nl;
		if (c == ' ' || c == '\n' || c == '\t')
			inword = NO;
		else if ( inword == NO ) {
			inword = YES;
			++nw;
		}
	}
	printf("%d %d %d\n", nl, nw, nc);
}

The variable inword records whether the program is currently in a word or not; initially it is “not in a word”.

The else is an alternative action to be done if the condition part of if is false.

if (expression)
	statement-1
else
	statement-2

One and only one of the two statements associated with if-else is done, not both.

1.6 Arrays

The number of elements in an array declaration must be constant at compile time, and the size of the array cannot be adjusted while the program is running.

This leads to security flaws referred to as “buffer overflow”, where a program reads more data than fits into an array and may overwrite other data or compromise the application.

#include <stdio.h>

main()   /*count digits, white space and other*/
{
	int c, i, nwhite, nother;
	int ndigit[10];
	
	nwhite = nother = 0;
	for (i=0; i<10; ++i)
		ndigit[i] = 0;
	
	while ( (c = getchar()) != EOF )
		if (c >= '0' && c<= '9')
			++ndigit[c-'0'];
		else if ( c == ' ' || c == '\n' || c == '\t')
			++nwhite;
		else
			++nother;
	printf("digits =");
	
	for (i=0; i<10; ++i)
		printf(" %d", ndigit[i]);
	
	printf("\nwhite space = %d, other = %d\n", nwhite, nother);
}

int ndigit[10]; is an array of 10 integers.
if (c >= '0' && c<= '9') checks if the character in c is a digit.
If it is, then c-'0' is the digit.

By definition, arithmetic involving char and int converts everything to int before proceeding, so c-'0' is an integer expression.

1.7 Functions

In C, a function is equivalent to a subroutine or function in Fortran, or a procedure in other languages. It encapsulates a computation in a black box.

#include <stdio.h>

main() 
{
	int i;
	
	for (i = 0; i<10; ++i)
		printf("%d %d %d\n", i, power(2,i), power(-3,i));
}

power(x, n)  /* raise x to nth power;  n>0 */
int x, n;
{
	int i, p;

	p = 1;
	for (i = 1; i<=n; ++i)
		p = p * x;
	return (p);
}

Each function has the same form:

name (argument list, if any)
argument declarations, if any
{
    declarations
    statements
}

1.8 Arguments - Call by Value

In C, function arguments are passed by value. This means the function is given the values of its arguments in temporary variables (on a stack) rather than their addresses.
Passing ‘by value’ became the norm in languages after C because it prevents the called code from modifying the caller's variables and creating side effects.

The call stack that made it possible to pass by value also made it possible for the function to call itself recursively.

In Python, immutable values such as integers and strings behave as if they were passed by value, while mutable structures such as dict and list can be changed by the called function, because it receives a reference to the same object.

power(x, n) 
int x, n;
{
	int i, p;
	
	for (p = 1; n>0; --n)
		p = p*x;
	return (p);
}

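The argument n is used as a temporary variable that is counted down until it becomes 0, so there is no need for i. Whatever is done to n inside power has no effect on the argument that power was originally called with.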

1.9 Character Arrays

#include <stdio.h>

#define MAXLINE 1000   /* max input line size*/

main()   /* find longest line */
{
	int len;                /* current line length */
	int max;                /* Max length seen so far */
	char line[MAXLINE];     /* current input line */
	char save[MAXLINE];     /* longest line, saved */
	
	max = 0;
	while ((len = get_line(line, MAXLINE)) > 0 )
		if (len > max) {
			max = len;
			copy(line, save);
		}
	if (max > 0)    /* there was a line */
		printf("%s", save);
}

get_line(s, lim)    /* get line into s, return length */
char s[];
int lim;
{
	int c, i;
	
	for (i=0; i<lim-1 && (c=getchar()) != EOF && c!='\n'; ++i )
		s[i] = c;
	if (c == '\n') {
		s[i] = c;
		++i;
	}
	s[i] = '\0';
	return(i);
}

copy(s1, s2)    /* copy s1 to s2; assume s2 big enough */
char s1[], s2[];
{
	int i;
	i = 0;
	while ((s2[i] = s1[i]) != '\0' )
		++i;
}

1.10 Scope; External Variable

The variables in main (line, save, etc.) are private or local to main because they are declared within main. No other function has direct access to them.
A local variable in a routine comes into existence only when the function is called and disappears when the function exits.

Global (external) variables, which are defined outside any function, can be accessed by any function. They retain their values because they do not disappear.

An external variable must also be declared in each function that wants to access it. This may be done either by an explicit extern declaration or implicitly by context.

#include <stdio.h>

#define MAXLINE 1000   /* max input line size*/

char line[MAXLINE];     /* current input line */
char save[MAXLINE];     /* longest line, saved */
int max;                /* Max length seen so far */


main()   /* find longest line */
{
	int len;                /* current line length */
	extern int max;
	extern char save[];		
	max = 0;
	
	while ((len = get_line()) > 0 )
		if (len > max) {
			max = len;
			copy();
		}
	if (max > 0)    /* there was a line */
		printf("%s", save);
}

get_line()    /* get line into the external array line, return length */
{
	int c, i;
	extern char line[];
	
	for (i=0; i<MAXLINE-1 && (c=getchar()) != EOF && c!='\n'; ++i )
		line[i] = c;
	if (c == '\n') {
		line[i] = c;
		++i;
	}
	line[i] = '\0';
	return(i);
}

copy()    /* copy line to save; assume save is big enough */
{
	int i;
	extern char line[], save[];
	
	i = 0;
	while ((save[i] = line[i]) != '\0' )
		++i;
}

The external variables are there even when you don’t want them.