Understanding Tokens In C: Keywords, Identifiers, Operators & More

As we all know, the English language comprises varied types of words, such as nouns, adjectives, pronouns, articles, and conjunctions. These words make up the whole language and are used according to the rules of grammar. Similarly, every programming language is made up of individual components, called tokens, that play varied roles in a program. In this article, we will discuss tokens in C, their significance, their types, and the role they play within a C program.

What Are Tokens In C Language?

Tokens in C are the individual units of source code that make up the C programming language. They are the smallest individual elements (or words or characters) that have a specific meaning and purpose in any programming language.

In other words, tokens are essential components or the building blocks of C programs and are processed by the C compiler to generate an executable program. There are 6 primary types/ classifications of tokens in C language, which we will discuss in later sections.

Use Of Tokens In C

Tokens in C include identifiers, keywords, constants, operators, and punctuation symbols. The C compiler reads the source code and breaks it down into these tokens, which it uses to understand the program's structure and functionality. Tokens serve as the foundation upon which C programs are built and executed.

Classification Of Tokens In C

While there are 6 primary types of tokens in C language, we do have other classifications. Listed below are all the different types of tokens in C:

Keywords (Reserved words with special meaning that are used for specific purposes only.)
Identifiers (Names used to identify variables, functions, structures, and other program components.)
Operators (Symbols used to perform manipulations and modifications on data.)
Literals
Special Symbols
Constants
Strings (A sequence of characters/ character array.)
Character Set

We will discuss each of these token types in the sections ahead with the help of proper examples. So let's get started.

Keyword Tokens In C

Keywords are reserved words in C that have predefined meanings and are used to express specific actions or operations in the code. They are an integral part of the C language syntax and cannot be used as identifiers (variable names or function names). The original ANSI C standard (C89/C90) defines 32 keywords; later standards, such as C99, added a few more (for example, inline and restrict).

Some examples of keywords in C are if, while, for, int, else, switch, and return. It is essential to use these keywords exactly as the syntax rules require so that the compiler interprets the code correctly. Note that preprocessor directives such as #include are not keywords: they are handled by the preprocessor, which the compiler runs automatically before compilation, for example to pull in a header file. Let's look at a simple C program example that illustrates the use of the if keyword in an if-statement.

Code Example:


Output:

Positive number

Explanation:

In the simple C code example, we first include the input-output header file <stdio.h>.

Then, we begin the main() function, which is the entry point for the program's execution.
In main(), we declare a variable num of integer data type and assign the value 10 to it.
Next, we use an if-statement, which checks if the value of the num variable is greater than 0, i.e., num>0.
- If the condition is true, the if-block is executed. It contains a printf() statement, which displays the string message- Positive number to the console.
- If the condition is false, the flow moves to the next line in the program.
After the if-statement, the main() function terminates with a return 0 statement, indicating successful execution.

The if keyword is reserved for if-statements in a C program. Using this keyword any other way will lead to errors. See the table below, listing various other reserved keyword tokens in C and their significance.

Different Reserved Keyword Tokens In C

Keyword	Description
auto	Specifies automatic storage class
break	Terminates a loop or switch statement
case	Labels a statement within a switch
char	Declares a character variable
const	Specifies a constant value
continue	Skips an iteration of a loop
default	Specifies default statement in switch
do	Starts a do-while loop
double	Declares a double-precision floating-point variable
else	Executes when the if condition is false
enum	Declares an enumeration type
extern	Declares a variable or function that is defined elsewhere
float	Declares a floating-point variable
for	Starts a for-loop
goto	Jumps to a labeled statement
if	Starts an if statement
int	Declares an integer variable
long	Declares a long integer variable
register	Specifies register storage class
return	Returns a value from a function
short	Declares a short integer variable
signed	Declares a signed variable
sizeof	Determines the size of a data type or object
static	Specifies static storage class
struct	Declares a structure type
switch	Starts a switch statement
typedef	Creates an alias for an existing data type
union	Declares a union type
unsigned	Declares an unsigned variable
void	Specifies the absence of a value, such as a function with no return value

Identifier Tokens In C

Identifier tokens in C programming are user-defined names for variables, functions, and other program elements. They are created by the programmer and must follow certain naming rules. For instance, identifiers must start with a letter or an underscore and can include letters, digits, and underscores.

Also, meaningful and descriptive identifiers help make the code more readable and maintainable. Choosing appropriate identifiers that reflect the purpose and usage of the associated elements in the code is crucial for writing high-quality C programs. The example C program below illustrates the use of this type of token in C.

Code Example:


Output:

Age: 25
Salary: 5000.50

Explanation:

In the sample C code-

Inside the main() function, we initialize two variables, i.e., age (an integer with a value of 25) and salary (a floating-point number with a value of 5000.50).
The names- age and salary are identifiers that describe the type of value they store.
Next, we use the printf() function to display the values of age and salary to the console.
Here, the %d and %.2f format specifiers indicate integer and floating-point values. The newline escape sequence (\n) shifts the cursor to the next line.

Rules For Naming Identifier Tokens In C

Here are the rules one must follow for naming variables, functions, etc., identifier tokens in C:

Valid characters: Identifier tokens in C can consist of letters (both uppercase and lowercase), digits, and underscores. Special characters and blank spaces are not allowed in identifiers. For example, myVar, count, and total_score are valid identifiers.
First character: The first character of an identifier token in C must be a letter or an underscore. That is, it cannot be a digit or any other special character. For example, _myVar and score are valid identifiers, but 9count and #score are invalid. Note that names beginning with an underscore followed by an uppercase letter or a second underscore are reserved for the implementation, so they are best avoided in your own code.
Avoid using keywords: As discussed above, keywords such as int, while, and for have predefined meanings in C and cannot be used as identifiers.
Length limit: The C standard does not cap the length of an identifier, but it only guarantees that a minimum number of leading characters are significant (in C99, 63 for internal identifiers and 31 for external ones; older compilers may honor fewer). Hence, it is recommended to keep identifiers reasonably short and descriptive for readability and maintainability.
Case convention: C is case-sensitive, so total and Total are different identifiers. A common convention is to use lowercase letters for variable and function names and uppercase letters for constants. For multi-word identifiers, you can either join lowercase words with underscores (e.g., total_score) or capitalize the first letter of each word after the first (e.g., camelCase).

Operator Tokens In C

Operators are symbols that are used to perform operations, manipulations and modifications on data. More specifically, they perform arithmetic, logical, relational, and other operations on variables and values in the code.

There are many types of operators in C, including arithmetic operators, logical operators, relational/ comparison operators, assignment operators, bitwise operators, increment/ decrement operators and conditional/ ternary operator.
Depending upon the number of operands an operator takes, operators can be classified as unary, binary, or ternary.
The correct use of operator tokens in C is essential for performing desired computations and manipulations on data in C programs.
Understanding their precedence and associativity rules is also important for writing correct and efficient code.

Look at the example expressions below, which illustrate the use of different types of operators in C, along with the explanation in the code comments. Say we have two variables, a and b, with values 5 and 10, respectively. Then, we can perform addition, multiplication, etc, operations on them by using corresponding operators.

// Example of operators in C
int a = 5, b = 10; //simple assignment operator
int sum = a + b; // addition operator
int product = a * b; // operator for multiplication
int isGreater = a > b; // greater than relational operator
a += 1; // compound assignment operator

Precedence & Associativity Rules For Operators In C

Operators in C have different levels of precedence, which determine how the parts of an expression are grouped and therefore the order in which its operations are applied, sort of like BODMAS in mathematical problems. The rules of precedence are:

Operators that have higher precedence are evaluated before operators with lower precedence.
For example, multiplication (*) and division (/) have higher precedence than addition (+) and subtraction (-). So, expressions with multiplication and division will be evaluated before addition and subtraction.
Parentheses can be used to explicitly specify the order of evaluation in an expression. Anything enclosed in parentheses is evaluated first, overriding the default precedence rules.

Associativity Rule For Operator Tokens In C

Associativity determines the order in which operators with the same precedence are evaluated when they appear in an expression.

Operators can be left-associative or right-associative. Left-associative operators are evaluated from left to right, while right-associative operators are evaluated from right to left.
For example, addition (+) and subtraction (-) are left-associative, so if they appear consecutively in an expression, they are evaluated from left to right.
The assignment operator (=) is right-associative, so if multiple assignment operators appear consecutively, they are evaluated from right to left.

Types Of Operator Tokens In C

The operator tokens in C are most commonly classified as unary or binary operators, depending on the number of operands they take. (The conditional operator ?: is the only ternary operator, taking three operands.)

Unary Operator Tokens In C

A unary operator is an operator that operates on a single operand or variable in an expression. It performs an operation or transformation on the value of the operand and typically returns a new value. The unary operator tokens in C include negation (-), increment (++), decrement (--), logical NOT (!), and bitwise one's complement (~).

Unary operators are commonly used in programming to manipulate the value of a single variable or expression. The C program example below illustrates the use of the unary negation operator.

Code Example:


Output:

y: -5

Binary Operator Tokens In C

A binary operator operates on two operands or variables in an expression. It takes two values, performs an operation on them, and returns a result. Binary operators are used to perform arithmetic, logical, and comparison operations.

Examples of binary operators include addition (+), subtraction (-), multiplication (*), division (/), and equality (==). Binary operators are commonly used in programming to perform calculations or compare two values. The C code example below illustrates the use of the addition binary operator token.

Code Example:


Output:

The result of 5 + 3 is 8

Different Types Of Operator Tokens In C Language

Listed in the table below are the different types of operators along with the symbols and a short description.

Operator Type	Description	Example
Arithmetic Operators	Perform mathematical operations	`+, -, *, /, %`
Relational Operators	Compare values	`==, !=, >, <, >=, <=`
Logical Operators	Combine and manipulate boolean expressions	`&&, ||, !`
Assignment Operators	Assign values to variables	`=, +=, -=, *=, /=, %=`
Increment/ Decrement Operators	Increase or decrease variable values by one	`++, --`
Ternary/ Conditional Operator	Selects one of two expressions based on a condition	`?:`
Bitwise Operators	Manipulate individual bits	`&, |, ^, ~, <<, >>`
Member Access Operators	Access members of a structure or union	`., ->`
Pointer Operators	Dereference a pointer and take the address of an object	`*, &`
Miscellaneous/ Special Operators	The comma operator evaluates two expressions in sequence and yields the value of the second.	`,`

These are some of the most commonly used types of operator tokens in C. Each type serves a different purpose and is used in various programming scenarios. It's important to understand the behavior and precedence of operators to write correct and efficient C code.

Literal Tokens In C

Literals are used to represent fixed values that remain unchanged during program execution. There are several types of literal tokens in C, including numeric literals (e.g., integers or floating-point numbers), character literals (e.g., 'A' or '\n' for newline), and string literals (e.g., "Hello, World!").

Proper use of literals helps define the values of variables, constants, and other data elements in the code, and their accurate representation is crucial for achieving the desired behavior in code. Below is a sample C program that illustrates the creation of various literal tokens in C.

Code Example:


Output:

Integer Literal: 42
Float Literal: 3.140000
Character Literal: A
Double Literal: 2.718280
String Literal: Hello

Explanation:

In the example C code-

In the main() function, we declare and initialize variables using different types of literals, including-
- integerLiteral with an integer literal value of 42.
- floatLiteral with a floating-point literal value of 3.14.
- charLiteral with a character literal value of 'A'.
- doubleLiteral with a double-precision floating-point literal value of 2.71828.
- stringLiteral with a string literal value of "Hello".
We then use a set of printf() statements to display the values of these literals with appropriate format specifiers.
Finally, the program returns 0, indicating successful execution, before exiting.

Types of Literals

Integer literals: Integer literals are used to represent integer values in C, which are whole numbers without any decimal point. They can be written with or without a sign and can be represented in different bases, such as decimal, hexadecimal, and octal. For example:

Decimal: 42, -10
Hexadecimal: 0xFF (which represents 255 in decimal)
Octal: 012 (which represents 10 in decimal)

Floating-point literals: Floating-point literals are used to represent floating-point values in C, which are numbers with a decimal point. They can be written with or without a sign and can have an optional exponent part. For example:

3.14, -0.5
2.0e-3 (which represents 0.002 in decimal)

Character literals: Character literals are used to represent individual characters in C, enclosed in single quotes (' '). They can be regular characters, escape sequences representing special characters (such as '\n' for a newline character), or octal or hexadecimal escape sequences representing characters by their ASCII values. For example:

'a', '1', '\n' (newline character), '\x41' (which represents 'A' in ASCII)

String literals: String literals are used to represent sequences of characters in C, enclosed in double quotes (" "). They are used to represent strings of characters, which are arrays of characters terminated by a null character ('\0'). For example:

"hello", "world"

Boolean literals: Boolean literals represent boolean values, which can be either true or false. C has traditionally represented boolean values with integers, where 0 means false and any non-zero value means true. Since C99, the <stdbool.h> header also provides the bool type along with the macros true and false. For example:

0 (false), 1 (true)

Binary literals: Binary literals are used to represent integer values in binary (base 2) notation. They were standardized in C23; before that, they were available only as a compiler extension (for example, in GCC and Clang). They start with the prefix 0b (or 0B) followed by a sequence of binary digits (0 or 1). For example:

0b1010 (which represents 10 in decimal)

Special Symbol Tokens In C

Special symbol tokens in C include punctuation marks and other characters used to structure and separate elements in code. Examples of special symbols include commas, semicolons, parentheses, curly braces, and brackets.

They are crucial for writing syntactically correct C programs and help organise the code. Understanding the role and usage of these symbols is essential for maintaining the integrity and readability of C code. Below is a C program sample that illustrates the use of many special symbol tokens in C language.

Code Example:


Output:

num2 is greater

Explanation:

In this C code sample-

In the main() function, we declare two integer variables, num1 and num2, in a single line using the comma symbol between them.
Then, we assign value 10 to num1 and value 20 to num2 using the assignment operator (=).
Next, we create an if conditional/ control statement, followed by a pair of parentheses ( ). The if-statement allows for conditional execution of code based on a condition.
We have a condition num1 > num2 inside the if-statement to check whether num1 is greater than num2. If this condition is true, the code inside the corresponding block (curly braces{}) will be executed.
- If the condition is true, the if-block containing the printf() function prints the message- 'num1 is greater' to the console.
- If the condition is false, the else-block containing the printf() function prints the message 'num2 is greater' to the console.
The program above showcases the use of multiple special characters and strings.

Examples Of Special Symbol Tokens In C

In the C programming language, several special symbols serve various purposes and have special meanings. Here are some examples of special symbols in C:

Comma (,): The comma separates variables in a declaration and arguments in a function call. For example:

int a, b, c; // declares three integer variables a, b, and c
printf("%s %s", "Hello", "World"); // calls printf with a format string and two string arguments

Semicolon (;): The semicolon, also known as a statement terminator, terminates statements in C. It indicates the end of a statement and separates multiple statements. For example:

int a = 10; // initializes integer variable a with value 10
printf("Hello World"); // prints "Hello World" to the console

Parentheses (()): Parentheses or simple brackets are used for grouping expressions, passing arguments to functions, and controlling the order of evaluation in expressions. For example:

int sum = (a + b) * c; // calculates the sum of a and b, and then multiplies it by c
printf("Value of a: %d", a); // prints the value of a using a format specifier in parentheses

Braces ({ }): Braces or curly brackets are used to define blocks of code, such as function bodies, loop bodies, and conditional statements. They enclose a group of statements and define their scope. For example:

if (a > b) { // starts a conditional statement block
    printf("a is greater than b");
} // ends the conditional statement block

Brackets ([ ]): Square brackets are used to declare arrays, to access their elements, and to specify array sizes in function prototypes. For example:

int arr[5] = {10, 20, 30, 40, 50}; // declares an integer array with size 5 and initializes it
printf("Value at index 2: %d", arr[2]); // accesses the value at index 2 in the array

Asterisk (*): Used as the multiplication operator, in pointer declarations, and as the dereference operator for pointer variables. For example:

int a = 5 * 2; // Multiplication
int *ptr = &a; // Pointer declaration, initialized to point to a
int b = *ptr; // Dereferencing the pointer (b becomes 10)

Ampersand (&): Used as the address-of operator to get the memory address of a variable. For example:

int x = 10;
int *ptr = &x; // Pointer to x

Constant Tokens In C

Constant tokens in C are values that do not change during the execution of a program. They are also known as literal values or simply literals. In other words, constant tokens in C represent fixed values, such as numbers, characters, and strings, that remain unchanged throughout the execution of a C program.

Code Snippet:

// Example of constants in C
int num = 10; // integer constant
float pi = 3.14; // floating-point constant
char ch = 'A'; // character constant
char str[] = "Hello"; // string literal

Primary Constant Tokens In C

Primary constants are the basic constant forms built directly into the language. In the classification commonly used in C textbooks, they are integer constants (e.g., 10), real or floating-point constants (e.g., 3.14), and character constants (e.g., 'A'). Each one is a fixed value written directly in the source code.

Secondary Constant Tokens In C

Secondary constants are constant values built from the primary ones using derived types. In the same classification, they include arrays, strings, pointers, structures, unions, and enumerations. For example, the string literal "Hello" is a secondary constant because it is an array of character constants.

String Tokens In C

String tokens in C are sequences of characters represented as arrays of characters terminated by a null character (\0). They are used to store and manipulate text or character data in C programs. C has no built-in string data type; strings are simply character arrays, and they are widely used in operations such as input/output, manipulation, and storage of textual data. String literals in C are written using double quotes (" ").

Code example:


Output:

Greeting: Hello, World!

Explanation:

In the C program sample-

We create a character array called greeting and initialize it with the string literal "Hello, World!".
Here, we use the double-quote syntax, and the empty square brackets tell the compiler to work out the array size from the initializer (including the terminating null character).
Then, we use the printf() function to display the value of this string to the console.

Types Of String Tokens In C

In the C programming language, strings can be represented and written in a few different ways. Here are some commonly used forms:

Character Arrays: Strings in C are often stored in an array of characters, where each element holds one character of the string. The string is terminated by a null character ('\0'). For example:

char str[] = "Hello, World!";

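Character Pointers: C allows you to refer to strings using character pointers, which point to the first character of the string. The string is again terminated by a null character ('\0'). When the pointer refers to a string literal, it is best declared as const char *, since modifying a string literal results in undefined behavior. For example:

const char *str = "Hello, World!";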

String Constants: C allows you to directly specify string constants by enclosing them in double quotes. String constants are treated as character arrays and automatically terminated by a null character ('\0'). For example:

"Hello, World!"

Escape Sequences: C provides escape sequences to represent special characters within strings. Escape sequences start with a backslash (\) followed by a specific character representing a particular value. Some commonly used escape sequences are:

\" (Double Quotes)	\t (Horizontal Tab)
\' (Single Quotes)	\r (Carriage Return)
\\ (Backslash)	\b (Backspace)
\n (Newline)	\f (Form Feed)

The Character Set In C

In practice, the character set used by most C implementations is based on the ASCII (American Standard Code for Information Interchange) standard, although the C standard itself does not require ASCII. ASCII defines a set of 128 characters, including control characters (such as newline, carriage return, and tab) and printable characters (letters, digits, punctuation marks, and special symbols).

The ASCII character set uses 7 bits to represent each character, allowing for a total of 128 unique characters.
The characters are encoded using integer values ranging from 0 to 127. For example, the character 'A' is represented by the integer value 65, 'B' by 66, and so on.
In addition, the C language supports escape sequences that represent special characters and white space, such as '\n' for newline, '\t' for tab, and '\r' for carriage return.
Furthermore, C provides the concept of wide characters, which are used to handle internationalization and localization.
Wide characters are represented by the wchar_t type and can use a wider character encoding, such as Unicode, to support a broader range of characters beyond the ASCII set.

Conclusion

Tokens in C programming are the fundamental building blocks that allow developers to create complex and powerful applications. By understanding the different types of tokens and their roles within a program, you can write efficient and error-free C code. Whether you're a beginner or an experienced programmer, a solid grasp of tokens in C programming is essential for mastering this versatile language and harnessing its full potential in software development.

Frequently Asked Questions

Q. What are identifiers in C?

Identifier tokens in C are used to name variables, functions, and other basic elements of a program. They are user-defined names that refer to entities such as objects in memory, functions, and types. When naming identifiers, one must follow specific rules: an identifier must start with a letter or an underscore, and the remaining characters can be letters (uppercase and lowercase), digits, and underscores.

Q. What are constants in C?

Constant tokens in C are fixed values that do not change during the execution of a program. They can be of various types, such as integer constants, floating-point constants, character constants, and string literals. In addition, the const keyword can be used to declare a variable whose value is set when it is initialized and cannot be modified afterwards.

Q. What are operators in C?

Operator tokens in C are symbols used to perform operations on operands. They can be arithmetic operators (e.g., +, -, *, /), relational operators (e.g., <, >, ==, !=), logical operators (e.g., &&, ||, !), assignment operators (e.g., =, +=, -=), and more.

Q. What are special symbols in C?

Special symbol tokens in C include punctuation marks and other characters that are used to structure and separate program elements in C code. Examples include commas, semicolons, parentheses, braces (curly brackets), and square brackets. They define the syntax and organize the code, and are crucial for writing syntactically correct C programs.

Q. What are string literals in C?

String literals in C are sequences of characters enclosed in double quotes (" "). They are used to represent strings in C programs. String literals should be treated as read-only: attempting to modify one at runtime results in undefined behavior. They are used for operations such as input/output, manipulation, and storage of textual data.

Q. How are tokens used in C programs?

The term tokens refers to the smallest meaningful elements of a program, such as keywords, identifiers, constants, operators, special symbols, and string literals. They are recognized by the compiler and are used to form individual statements, expressions, and other syntactic components of the code. Understanding the different types and usage of tokens in C is essential for writing correct and efficient programs.

Shivani Goyal

Manager, Content

An economics graduate with a passion for storytelling, I thrive on crafting content that blends creativity with technical insight. At Unstop, I create in-depth, SEO-driven content that simplifies complex tech topics and covers a wide array of subjects, all designed to inform, engage, and inspire our readers. My goal is to empower others to truly #BeUnstoppable through content that resonates. When I’m not writing, you’ll find me immersed in art, food, or lost in a good book—constantly drawing inspiration from the world around me.