C Programming Language
Over half a century ago, in the early 1970s, the C programming language was created at Bell Labs. Its primary purpose was the development of the Unix operating system, which turned out to be a highly successful venture. Despite frequent criticism, C is still one of the most widely used programming languages in the world, and it is the most popular one for developing embedded systems.
The reasons behind C's popularity and success are many and varied. First, C compilers are available for nearly every processor, ranging from tiny DSPs used in hearing aids to supercomputers. This versatility makes it possible to use C for a wide range of applications.
Furthermore, C compiled code can be highly efficient without hidden costs. Programmers can predict running times to some extent even before testing, without using tools for the worst-case execution time approximation. This saves a lot of time and effort and helps to streamline the development process.
Additionally, C allows writing compact code thanks to limited verbosity and the availability of many built-in operators. This feature makes it easier to write and debug code, which can result in faster development times.
It is worth noting that C is defined by international standards. It was first standardized in 1989 by the American National Standards Institute (referred to as ANSI C), and then in 1990 by the International Organization for Standardization (ISO). Standardization gives C a stable, vendor-independent definition.
Moreover, C, possibly with extensions, allows easy access to the hardware, a crucial aspect for developing embedded software. This feature provides developers with greater control over the hardware, making it easier to optimize performance and functionality.
C has a long history of usage in all kinds of systems, including safety-, security-, mission-, and business-critical systems. This track record means that its strengths and weaknesses are well understood across a wide range of applications.
Lastly, C is widely supported by all sorts of tools, making it easier to develop and maintain software written in C. This includes compilers, debuggers, and other software development tools.
C Is Not Fully Defined
There are claims that C++ will eventually replace C, but this is unlikely in the embedded software industry. C++ is a very complex language that is constantly evolving, which contrasts with industrial best practices. C, on the other hand, is a stable and compact language. However, C is subject to criticism regarding its behavior and the as-if rule. This rule allows the compiler to optimize the code while maintaining the observable behavior. C is not fully defined in this respect, and there are four classes of non-definite behaviors.
- Implementation-defined behavior: unspecified behavior where each implementation documents how the choice is made; e.g., the sizes and precise representations of the standard integer types;
- Locale-specific behavior: behavior that depends on local conventions of nationality, culture, and language that each implementation documents; e.g., character sets and how characters are displayed;
- Undefined behavior: behavior, upon use of a non-portable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements; e.g., attempting to modify a string literal, or shifting an expression by a negative number or by an amount greater than or equal to the width of the promoted expression;
- Unspecified behavior: use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance; e.g., the order in which sub-expressions are evaluated.
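To make the last class concrete, here is a minimal sketch contrasting an expression whose operand evaluation order is unspecified with a version that fixes the order by using separate statements. All function and variable names are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helpers: each records the order in which it was called. */
static int call_log[2];
static size_t call_count;

static int first(void)
{
    call_log[call_count++] = 1;
    return 10;
}

static int second(void)
{
    call_log[call_count++] = 2;
    return 20;
}

/* Unspecified: the compiler may call first() or second() first. */
int sum_unspecified(void)
{
    call_count = 0U;
    return first() + second();
}

/* Deterministic: separate statements are sequenced in program order. */
int sum_ordered(void)
{
    call_count = 0U;
    const int a = first();
    const int b = second();
    return a + b;
}

/* Returns the identifier (1 or 2) of the helper called at position i. */
int call_order(size_t i)
{
    return call_log[i];
}
```

Both functions return the same sum, but only sum_ordered guarantees the order of the side effects, which matters as soon as the helpers touch shared state.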
Non-definite behavior exists in C for two reasons. First, it allows efficient compilers to be written for almost any hardware architecture. Second, the language is standardized while many compilers from different vendors are available.
Regarding the second point, it's important to note that ISO typically standardizes existing practices while considering input from participating vendors. They also prioritize backward compatibility. Therefore, when there are divergent implementations, non-definite behavior may be the only feasible solution.
In C, the main objective is to obtain efficient code with no hidden costs. This is the reason why there is no run-time error checking. However, easy access to hardware can make it simple to corrupt the program state.
One of the drawbacks of code compactness is that the language can be easily misunderstood and misused.
The C language is expected to remain true to its original purpose and will continue to be used in the development of embedded systems. However, it is worth noting that certain features of C may conflict with safety and security requirements. Therefore, it is essential to subset the language for critical applications. This has been recognized early and is now mandated or recommended by various safety and security-related industrial standards, such as IEC 61508 for industrial applications, ISO 26262 for automotive, CENELEC EN 50128 for railways, RTCA DO-178B/C for aerospace, and FDA's General Principles of Software Validation.
C Can Be Difficult To Read
It is important to use C language features correctly, as improper usage can adversely affect the readability and understandability of the program. Some of the features that can cause issues are:
• The preprocessing phase
• A large number of operators with complex precedence and associativity rules
• Numerous control-flow mechanisms, some with complex semantics (such as goto, switch, for, break, continue, setjmp/longjmp, etc.)
• Implicit conversions that follow intricate rules
• Two types of comment markers that interact in a tricky way with line splicing (i.e., splitting logical lines into multiple physical lines using trailing backslashes as line-continuation markers).
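Two of the items above can be shown in a few lines. The following sketch (function names are hypothetical) illustrates how line splicing interacts with a one-line comment and how an implicit conversion changes the outcome of a comparison. Most compilers accept this code, possibly with warnings.

```c
/* Line splicing: the trailing backslash joins the next physical line
 * to the one-line comment, so the assignment below never executes. */
int spliced_assignment(void)
{
    int x = 1;
    // is the next line executed? \
    x = 2;
    return x; /* x is still 1 */
}

/* Implicit conversion: -1 is converted to unsigned int before the
 * comparison and becomes UINT_MAX, so the result is 0 (false). */
int minus_one_is_less_than_one(void)
{
    return -1 < 1U;
}
```

Neither result is what a casual reader would expect, which is why coding guidelines restrict these constructs.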
The International Obfuscated C Code Contest, which has been running since 1984, awards a prize to the most obscure/obfuscated C programs that follow specific rules. As already mentioned, ensuring code readability and understandability is crucial for the effectiveness of code reviews and has an obvious impact on other program properties such as maintainability.
MISRA C
The MISRA (Motor Industry Software Reliability Association) project was initiated in 1990 with the aim of providing best practice guidelines for the safe and secure development of both embedded control systems and standalone software. Originally, the project was part of the UK Government's "SafeIT" program, but it later became self-supporting, with MIRA Ltd, now HORIBA MIRA Ltd, providing the project management support. The core activities of MISRA include developing guidance in specific technical areas such as programming languages, model-based development, automatic code generation, software readiness for production, safety analysis, safety cases, and more. In November 1994, MISRA published its "Development guidelines for vehicle-based software," also known as "The MISRA Guidelines." This was the first automotive publication concerning functional safety, more than 10 years before work began on ISO 26262.
The MISRA guidelines prescribe the use of a restricted subset of a standardized structured language. As a response to this, the MISRA consortium started developing the MISRA C guidelines. At the time, Ford and Land Rover were each developing their own in-house rules for vehicle-based C software. It was recognized that creating a common set of guidelines would be more beneficial to the industry. The first version of the MISRA C guidelines was published in 1998 and it gained significant industrial attention.
In 2004, MISRA received numerous comments from its users, many of which were from non-automotive industries, surpassing their expectations. As a result, MISRA published an enhanced version of the C guidelines, specifically targeting all industries that develop C software for high-integrity and critical systems. The success of MISRA C prompted MISRA to release a set of MISRA C++ guidelines in 2008, as C++ is also used in critical contexts.
MISRA C:1998 and MISRA C:2004 both target the 1990 version of the C Standard. A revised set of guidelines called MISRA C:2012 was published in 2013; it supports both C99 and C90 (the latter in its amended and corrected form, sometimes referred to as C95). Compared to the previous versions, MISRA C:2012 covers more language issues and provides a more precise specification of the guidelines with improved rationales and examples.
The "JSF Air Vehicle C++ Coding Standards for the System Development and Demonstration Program" had a significant impact on the development of coding standards for C++. This, in turn, led to the creation of MISRA C++:2008, which influenced the development of MISRA C:2012. The UK Ministry of Defence supported the development of a vulnerabilities document that identified the non-definite behaviors in ISO C++. This document plays the role that Annex J plays for ISO C; the ISO C++ standard has no such annex, and its absence makes it difficult to ensure that the guidelines are comprehensive. Additionally, MISRA C played a vital role in shaping NASA's "JPL Institutional Coding Standard for the C Programming Language" and other coding standards.
The MISRA C guidelines aim to enhance the safety and security of embedded or standalone systems. The guidelines define a subset of the C language that eliminates or minimizes the possibility of errors. They prohibit critical non-definite behavior and restrict the use of implementation-defined behavior and compiler extensions. Additionally, they control the use of language features that can be easily misused or misunderstood. Overall, the guidelines are designed to improve the reliability, readability, portability, and maintainability of the systems.
There are two kinds of MISRA C guidelines.
- Directive: a guideline where the information concerning compliance is generally not fully contained in the source code: requirements, specifications, design, etc. may all have to be taken into account. Static analysis tools may assist in checking compliance with respect to directives if provided with extra information not derivable from the source code.
- Rule: a guideline where information concerning compliance is fully contained in the source code. Discounting undecidability, static analysis tools should, in principle, be capable of checking compliance with respect to the rule.
The MISRA C coding standard is intended to be used in conjunction with a documented development process, where any deviations from the standard that can be justified will be authorized and recorded. To make this process easier, each guideline in the MISRA C standard has been assigned a category.
- Mandatory: C code that complies with MISRA C must comply with every mandatory guideline; deviation is not permitted.
- Required: C code that complies with MISRA C shall comply with every required guideline; a formal deviation is required where this is not the case.
- Advisory: these are recommendations that should be followed as far as is reasonably practical; formal deviation is not required, but non-compliances should be documented.
Organizations or projects have the option to treat required guidelines as mandatory and advisory guidelines as required or mandatory. Adopting MISRA Compliance:2016 allows advisory guidelines to be disregarded and marked as "Disapplied" when they are deemed to have no value, such as when adopted code has not been developed to comply with the MISRA C guidelines. However, the decision to disapply a guideline should not be taken lightly. It is important to create a guideline recategorization plan that includes the rationale for any decision to disapply a guideline.
MISRA C rules are categorized based on the amount of code that needs to be analyzed to detect all violations. Single Translation Unit refers to checking each translation unit independently. System requires checking more than the translation unit in question, if not all the source code that constitutes the system.
MISRA C:2012 Amendment 1 was published in 2016 to extend the applicability of MISRA C:2012 to industries and applications that prioritize data-security. It includes 14 new guidelines (1 directive and 13 rules) to complete the coverage of ISO/IEC TS 17961:2013, which is a set of rules for secure coding in C.
The coverage of the CERT C Coding Standard is almost complete with Amendment 1. As a result, MISRA C is the preferred language subset for all industries developing embedded systems in C that are safety- and/or security-critical.
Important Rules
MISRA established standards to ensure safe and reliable automotive software. Strict rules minimize risks and prevent vulnerabilities. We'll discuss critical rules for code clarity and safety adherence. Understanding them helps developers navigate the intricacies of software development.
Directive D.4.1 (required): Run-time failures shall be minimized
It is important to ensure that there are measures in place to detect and handle run-time errors. This is particularly crucial for safety-critical systems, as C language does not have built-in mechanisms to handle dynamic errors. Therefore, special attention is required to ensure the safety and reliability of such systems.
As a developer, it is important to keep a watchful eye on the occurrence of run-time errors while coding. To prevent such errors, checks should be introduced in the code wherever necessary, while also considering performance and code-size constraints. Below are some areas that should be taken into consideration, as they are prone to run-time errors.
Arithmetic Errors
These errors arise when evaluating an expression causes a division by zero, an overflow, or an underflow.
{
    num1 = 10;
    for (i = 0; i < 10; i++)
    {
        result = num1 / i; // Noncompliant: i is 0 on the first iteration
    }
}
In the example, the loop variable i starts at zero, so the first iteration divides num1 by zero.
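One possible way to guard such a division is sketched below. The helper div_or_default and its fallback parameter are hypothetical; a real project would choose its own error-handling strategy. Besides a zero divisor, the sketch also rejects INT32_MIN / -1, whose result does not fit in 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: divides only when the result is well defined,
 * otherwise returns the caller-supplied fallback value. */
int32_t div_or_default(int32_t num, int32_t den, int32_t fallback)
{
    int32_t result = fallback;
    const bool overflow = (num == INT32_MIN) && (den == -1);

    if ((den != 0) && !overflow)
    {
        result = num / den;
    }
    return result;
}
```

A call such as div_or_default(num1, i, 0) inside the loop would then skip the division on the first iteration instead of failing at run time.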
Pointer Arithmetic
Pointer arithmetic computes addresses, and care must be taken that each computed address is valid and points to a meaningful location.
Left Shifts
Care must be taken with left shifts: bits shifted out past the most significant bit are lost, so the result may no longer represent the intended value.
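A check of this kind can be sketched as follows, assuming a 32-bit unsigned operand. The helper name shl_or_default and the fallback convention are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: shifts left only if no set bit would be lost;
 * otherwise returns the caller-supplied fallback value. */
uint32_t shl_or_default(uint32_t value, uint32_t count, uint32_t fallback)
{
    uint32_t result = fallback;

    /* A count of 32 or more is undefined for a 32-bit operand, and
     * value must fit in the bits that remain after the shift. */
    if ((count < 32U) && (value <= (UINT32_MAX >> count)))
    {
        result = value << count;
    }
    return result;
}
```

The comparison against UINT32_MAX >> count detects lost bits before the shift is performed, so the shift itself is always well defined.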
Array Bound Errors
Array indices must be checked to ensure that they lie within the bounds of the array being indexed.
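A bounds-checked accessor is one simple way to apply this. The sketch below uses a hypothetical buffer named samples and a hypothetical fallback value returned for out-of-range indices.

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_COUNT 8U /* hypothetical buffer size */

static const int16_t samples[SAMPLE_COUNT] = { 3, 1, 4, 1, 5, 9, 2, 6 };

/* Returns samples[index], or fallback when index is out of bounds. */
int16_t sample_at(size_t index, int16_t fallback)
{
    int16_t value = fallback;

    if (index < SAMPLE_COUNT)
    {
        value = samples[index];
    }
    return value;
}
```

Because index is unsigned, a single comparison against the array size is enough to cover both ends of the valid range.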
Rule R.1.3 (required): There shall be no occurrence of undefined or critical unspecified behavior
The ISO standard specifies a minimum set of about 91 characters that all compilers must support. This means that even if a compiler supports a larger set of characters, only these 91 characters should be used. The standard defines a set of escape sequences, namely \n, \t, \v, \b, \r, \f, \a, \\, \?, \', \", \[Octal Number], and \x[Hexadecimal Number]. It is important to note that using any other escape sequence leads to undefined behavior.
const char_t a[ 2 ] = "\k"; // Noncompliant
const char_t b[ 2 ] = "\b"; // Compliant
Rule R.3.1 (required): The character sequences /* and // shall not be used within a comment
C does not support nested comments. A comment starts at /* and ends at the first occurrence of */, regardless of any attempted nesting. A noncompliant code example illustrating the rule is shown as follows.
/* This is a multi-line comment statement.
/* Nesting of comments leads to unexpected behavior*/
The actual comment ends here */
The example shows a multi-line comment opened with /*. The second line contains another /* and ends with */, which closes the comment there, whereas the author intended it to end on the third line. The final line is therefore no longer part of the comment and is treated as code, which is easy to overlook. Hence the character sequence /* must not appear inside a comment.
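The opposite mistake is more dangerous, because the code still compiles. In the sketch below (all names are hypothetical), a forgotten end-of-comment marker silently removes a function call; the second function shows the intended form.

```c
static int safety_calls;

static void perform_safety_function(void)
{
    safety_calls++;
}

/* The first comment in this function is missing its end marker, so it
 * runs on until the end marker of the second comment and hides the call. */
int run_with_swallowed_call(void)
{
    safety_calls = 0;
    /* prepare the actuator (end-of-comment marker forgotten)
    perform_safety_function();
    /* this comment is where the first one actually ends */
    return safety_calls;
}

/* Intended form: both comments are closed, so the call is compiled. */
int run_as_intended(void)
{
    safety_calls = 0;
    /* prepare the actuator */
    perform_safety_function();
    /* the call above is now visible to the compiler */
    return safety_calls;
}
```

A checker that flags /* inside a comment would catch the first function, which is exactly the situation the rule is meant to expose.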
Rule R.7.1 (required): Octal constants shall not be used
Any integer constant that begins with “0” (zero) is treated as an octal constant. Octal constants are prohibited because they are easily mistaken for decimal ones. The following example illustrates this. Consider a user who wants to assign fixed-width integer constants to the elements of an array. The assignments are written as follows.
data [0] = 500; /* set to decimal 500*/
data [1] = 071; /* set to decimal 57*/
data [2] = 520; /* set to decimal 520*/
In the code snippet, the user writes 071 intending the decimal value 71, but because the constant begins with 0 the compiler treats it as octal and stores decimal 57 in data[1]. Octal constants are therefore best avoided. The constant 0 is an exception, since zero has the same value in octal and decimal.
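The arithmetic can be checked directly: octal 071 is 7 × 8 + 1 = 57. The two hypothetical functions below make the difference visible.

```c
/* The leading zero makes this an octal constant: 7 * 8 + 1 = 57. */
int looks_like_seventy_one(void)
{
    return 071;
}

/* Without the leading zero the constant is decimal. */
int really_seventy_one(void)
{
    return 71;
}
```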
Rule R.8.1 (required): Types shall be explicitly specified
Implicit typing of variables and functions, though supported by some C compilers, shall not be used, as it may lead to confusion. The example below shows noncompliant declarations, with the compliant form of each given in the comment.
extern x; // Compliant solution - extern int16_t x;
const x; // Compliant solution - const int16_t x;
static fun(void); // Compliant solution - static int16_t fun(void);
Rule R.9.1 (mandatory): The value of an object with automatic storage duration shall not be read before it has been set
A variable declared inside a function without a storage-class specifier has automatic storage duration (the auto class). Such variables shall not be read before they have been assigned a value, because their initial content is indeterminate. In the example below, data is assigned only inside a conditional branch that is never taken, so the function returns an indeterminate (garbage) value.
{
int data, value;
value=0;
if (value==1)
{
data = 45;
}
return data;
}
The intention of this rule is that every variable is given a value before it is used. The initialization need not be done at the point of declaration; it may be done elsewhere in the code, as long as it happens before the variable is read.
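A compliant variant of the example could look like the sketch below. The function name select_data and the default value 0 are assumptions made for illustration; the appropriate default depends on the application.

```c
#include <stdint.h>

/* data is given a value on every path before it is read. */
int32_t select_data(int32_t value)
{
    int32_t data = 0; /* defined default, chosen here for illustration */

    if (value == 1)
    {
        data = 45;
    }
    return data;
}
```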
Rule R.10.1 (required): Operands shall not be of an inappropriate essential type
Rules also govern which operand types may be used with each operator. One of them concerns the bitwise operators. The built-in bitwise operators (~, <<, <<=, >>, >>=, &, &=, ^, ^=, |, and |=), if applied to signed integers, yield implementation-defined results. The most significant case is the left shift: applied to a signed integer, it may shift a bit into or out of the sign bit, which is the most significant bit, and produce erroneous results. Hence, bitwise operators shall not be applied to signed integers. A supporting illustration is shown as follows.
if ((uint16_value & int16_data) == 0x246U)
if (~int16_data == 0x246U)
The example shows a bitwise operation being performed between an unsigned and a signed integer, and another bitwise operation being performed on a signed integer. Both of these operations might yield results far from the expected ones.
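For comparison, here is a sketch that keeps every bitwise operand unsigned. The function names are hypothetical. Note that operands narrower than int are promoted before the operator is applied, so the results are cast back to the 16-bit type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Both operands of & are unsigned, so the result is well defined. */
bool flags_match(uint16_t value, uint16_t mask, uint16_t expected)
{
    return (uint16_t)(value & mask) == expected;
}

/* The operand is widened to an unsigned type before ~ is applied, and
 * the result is cast back to discard the upper bits. */
uint16_t invert16(uint16_t value)
{
    return (uint16_t)(~(uint32_t)value);
}
```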
Rule R.11.3 (required): A cast shall not be performed between a pointer to object type and a pointer to a different object type
Rule R.11.4 (advisory): A conversion should not be performed between a pointer to object and an integer type
These rules are taken from the set of rules on conversions between types, and concern pointers in particular. A pointer to one object type shall not be cast to a pointer to a different object type, and conversions between pointers and integers should be avoided. If an address held by a pointer variable is assigned to an integer variable, the integer may not be able to hold the complete address because of its size. In that case data is lost, and further computations involving the variable will not yield the expected results. An example supporting this rule is shown.
int* data;
int addr_value = (int) &data;
The example converts the address of the pointer variable data to int and stores it in addr_value. Data loss may occur if int is not wide enough to hold an address, as on typical 64-bit platforms where int is 32 bits wide.
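Where such a conversion cannot be avoided, for example when accessing memory-mapped hardware, the data-loss problem can be addressed with uintptr_t. The sketch below uses a hypothetical function name, and it is an illustration of the size issue only: the conversion is still one that the rule advises against, so a project following MISRA C would be expected to document it.

```c
#include <stdbool.h>
#include <stdint.h>

/* uintptr_t (optional in the standard, but widely available) is wide
 * enough to hold a converted object pointer without losing bits. */
bool address_round_trips(void)
{
    int32_t value = 0;
    int32_t *data = &value;

    const uintptr_t addr_value = (uintptr_t)(void *)data;
    int32_t *const restored = (int32_t *)(void *)addr_value;

    return restored == data;
}
```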
Rule R.12.1 (advisory): The precedence of operators within expressions should be made explicit
This rule belongs to the set of rules on forming and evaluating expressions. Operator precedence determines how an expression is evaluated, but the precedence rules can be confusing. To avoid confusion that may lead to unintended results, it is advisable to limit reliance on them and to make the intended grouping explicit with parentheses in complex expressions. Parentheses should not be overused, however: operators of the same precedence, as in the second line below, do not need them. Examples are shown.
x = a == b ? a : a - b; // Noncompliant; Compliant Usage -> x = ( a == b ) ? a : ( a - b );
x = a + b - c + d; // Compliant: operators of equal precedence, evaluated as ( ( a + b ) - c ) + d
x = a * 3 + c + d; // Noncompliant; Compliant Usage -> x = ( a * 3 ) + c + d;
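Parentheses added for readability must not change the meaning of an expression. As a small check, the sketch below (function names are hypothetical) compares a + b - c + d with the regrouped form ( a + b ) - ( c + d ), which is a different expression.

```c
/* Left-to-right evaluation: ( ( a + b ) - c ) + d */
int sum_left_to_right(int a, int b, int c, int d)
{
    return a + b - c + d;
}

/* A different grouping that subtracts d instead of adding it. */
int sum_regrouped(int a, int b, int c, int d)
{
    return (a + b) - (c + d);
}
```

With a = 1, b = 2, c = 3, d = 4 the first function yields 4 and the second yields -4.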
Rule R.16.3 (required): An unconditional break statement shall terminate every switch clause
Rule R.16.4 (required): Every switch statement shall have a default label
Rule R.16.5 (required): A default label shall appear as either the first or the last switch label of a switch statement
Rule R.16.6 (required): Every switch statement shall have at least two switch clauses
Rule R.16.7 (required): A switch expression shall not have essentially Boolean type
These rules govern how a switch statement is written. Every switch clause, including the default clause, must end with an unconditional break statement; the only exception is an empty case clause, which may fall through to the next one. Every switch statement must have a default label, placed either first or last among the switch labels. The switch expression must not be of Boolean type: a Boolean computation yields only true (1) or false (0), so a simple if…else construct suffices for the two choices. For the same reason, a switch statement is expected to have at least two switch clauses to justify its use. These points are illustrated in the example.
switch (value) /* value is not of type Boolean*/
{
case 0:
statement 1;
break; /* break statement is required after the case statements*/
case 1: /* empty case clause, no break is required*/
case 2:
statement 1;
break;
default:
err_flag_count = 1;
break;
}
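The same structure can be written as a complete function. In this sketch the function name classify and the STATUS_ constants are hypothetical placeholders for the statements in the example.

```c
#include <stdint.h>

/* Hypothetical status codes used only for this illustration. */
#define STATUS_IDLE   0
#define STATUS_ACTIVE 1
#define STATUS_ERROR  (-1)

int32_t classify(uint8_t value)
{
    int32_t status;

    switch (value)            /* not essentially Boolean */
    {
        case 0U:
            status = STATUS_IDLE;
            break;
        case 1U:              /* empty clause: falls through to case 2 */
        case 2U:
            status = STATUS_ACTIVE;
            break;
        default:              /* last label; handles all other values */
            status = STATUS_ERROR;
            break;
    }
    return status;
}
```

Because every path through the switch assigns status, the function also reads no uninitialized value.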
Rule R.17.2 (required): Functions shall not call themselves, either directly or indirectly.
Recursive functions are not to be used in safety-critical systems, because recursion consumes stack space and risks exhausting it. Unless the recursion is tightly controlled, the worst-case stack usage cannot be determined. A supporting noncompliant code example is shown.
void func(void)
{
printf("This is a recursive function");
func(); //Noncompliant
}
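Many recursive algorithms can be rewritten as loops with a known bound. As a minimal sketch, assuming a 32-bit result, a factorial computed iteratively uses constant stack space. The function name and the convention of returning 0 for inputs that would overflow are assumptions made for this illustration.

```c
#include <stdint.h>

/* Iterative factorial: stack usage is constant and the loop bound is
 * known. Inputs above 12 would overflow 32 bits, so they return 0. */
uint32_t factorial(uint32_t n)
{
    uint32_t result = 0U;

    if (n <= 12U)
    {
        result = 1U;
        for (uint32_t i = 2U; i <= n; i++)
        {
            result *= i;
        }
    }
    return result;
}
```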
Rule R.9.5 (required): Where designated initializers are used to initialize an array object, the size of the array shall be specified explicitly
An array can be declared without specifying its size, in which case the size is determined by the initializer. When designated initializers are used, however, the resulting size is easy to misjudge, so the rule requires it to be stated explicitly. An illustration is shown.
int arr1 [ ];
int arr2 [ ] = { [0] = 1, [12] = 36, [4] = 93 };
int pirate [ ] = { 2, 4, 8, 42, 501, 90210, 7, 1776 };
In the example, none of the declarations specifies the array size explicitly. The first has no initializer at all. In the second, the highest designated index implicitly sets the size to 13 elements, which is not obvious at a glance; this is the case the rule prohibits. In the third, the number of initializers defines the size, which is clearer, but an explicit size still documents the intent.
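A compliant form of the second declaration states the size explicitly. The sketch below adds two hypothetical accessor functions so that the size and contents can be checked.

```c
#include <stddef.h>

/* The size is stated explicitly, so it no longer depends on the
 * highest designated index; unnamed elements are zero-initialized. */
static const int arr2[13] = { [0] = 1, [12] = 36, [4] = 93 };

/* Number of elements in arr2. */
size_t arr2_length(void)
{
    return sizeof(arr2) / sizeof(arr2[0]);
}

/* Returns arr2[index], or 0 when index is out of bounds. */
int arr2_at(size_t index)
{
    return (index < arr2_length()) ? arr2[index] : 0;
}
```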
Rule R.19.2 (advisory): The union keyword should not be used
The usage of union in order to access an object in different ways may cause mis-interpretation of the data. Hence this rule was designed to prevent the usage of unions in any circumstances. An example of a noncompliant code is shown.
union U1 // Noncompliant
{
float j;
int i;
};
Rule R.20.4 (required): A macro shall not be defined with the same name as a keyword
Keywords and other reserved names must not be given a new meaning through the preprocessor. Closely related is Rule R.21.1 (required), under which #define and #undef shall not be used on a reserved identifier or reserved macro name: the reserved identifiers, functions, and macros of the standard library shall not be defined, undefined, or redefined, and their names shall not be reused. An example is shown.
#define sqrt cathy
In the example, the name of the library function sqrt is redefined, so that wherever the identifier sqrt occurs in the code, the preprocessor replaces it with cathy. In reality, sqrt is declared in the math library header (math.h) and computes the square root of a given number.
Tools and Checkers
Static analysis is an essential part of the software development process that involves the automatic examination of source code without executing the application. It is an effective technique that can help to identify potential errors, security vulnerabilities, and other defects in the code, thus improving the quality and reliability of embedded software. For embedded software, there are various types of static analysis techniques that can be employed to detect different types of issues:
- Control flow analysis: This type of analysis checks the control flow of the code to identify potential errors, such as unreachable code and infinite loops.
- Data flow analysis: This type of analysis checks the flow of data through the code to identify potential errors, such as data races and memory leaks.
- Pointer analysis: This type of analysis checks the use of pointers in the code to identify potential errors, such as dangling pointers and buffer overflows.
- Security analysis: This type of analysis checks the code for potential security vulnerabilities, such as buffer overflows and SQL injection.
- Compliance analysis: This type of analysis checks the code for compliance with a specific standard, such as MISRA C or DO-178B.
Automated tools can scan a C codebase for MISRA C violations, generating detailed reports on potential issues and their locations.
LDRA Tool Suite (proprietary)
The LDRA Tool Suite is a software analysis tool from LDRA that provides automated testing, static analysis, and dynamic testing capabilities. It includes MISRA C compliance checking and helps ensure the safety and reliability of embedded systems software.
Parasoft C/C++test (proprietary)
This tool offers robust MISRA C support, including rule checking, automated reporting, and integration with development environments. It can identify coding practices that might not strictly violate the standard but could introduce maintainability or safety concerns.
PC-lint Plus (proprietary)
PC-lint Plus comes with a comprehensive rule set covering widely recognized coding standards such as MISRA, CERT C, and AUTOSAR. It is designed for integration into build and automation workflows and analyzes C and C++ source code.
Cppcheck (free software/open source)
Cppcheck provides code analysis to detect bugs and focuses on detecting undefined behaviour and dangerous coding constructs. Its goal is to have very few false positives. Cppcheck covers almost all the MISRA C:2012 rules, including the amendments; together with a C compiler, it provides full coverage.
References
- van der Linden, P. (1994). Expert C Programming. Prentice Hall.
- Bagnara, R., Bagnara, A., & Hill, P. M. (2019). The MISRA C coding standard: A key enabler for the development of safety-and security-critical embedded software. In embedded world Conference 2019—Proceedings (pp. 543-553).
- MISRA, MISRA C:2012 Addendum 2 (2018)— Coverage of MISRA C:2012 (including Amendment 1) against ISO/IEC TS 17961:2013 “C Secure”, 2nd ed. Nuneaton, Warwickshire CV10 0TU, UK: HORIBA MIRA Ltd.
- Yamili, Y. C., & Kathiresh, M. (2021). Autosar and misra coding standards. In Automotive Embedded Systems: Key Technologies, Innovations, and Applications (pp. 37-70). Cham: Springer International Publishing.