Alex Ricciardi

Posted on • Originally published at Medium

Understanding C++ Data Types, Vulnerabilities, and Key Differences with Java

This article provides an in-depth look at the various data types in C++, including primitive, derived, and user-defined types, while also addressing common vulnerabilities such as buffer overflows and incorrect type conversions. Additionally, it highlights the key differences between C++ and Java, focusing on how each language handles data types and memory management, with practical code examples for secure programming.

Java and C++ are two Object-Oriented Programming (OOP) languages, each with its own benefits and disadvantages. This article explores some of the differences between the two languages, with a focus on C++ data types and vulnerabilities. It also provides code examples on how to prevent common issues such as buffer overflows, integer overflows, incorrect type conversions, and null pointer dereferencing in C++.

C++ Data Types

In C++, there are broadly three types of data: Primitive (or Built-in) Data Types, Derived Data Types, and User-defined Data Types, see Figure 1:

Figure 1
C/C++ Data Types
Note: From “C++ Data Types” by Argarwal (2023)

  • The Primitive types, also referred to as fundamental types, are the basic data types that are predefined in C++. See below for a list of them:

Integer (int): Stores whole numbers, usually requires 4 bytes of memory, and in that case has a range from -2,147,483,648 to 2,147,483,647 (Argarwal, 2023).

Table 1
C++ Integers
Note: From “Fundamental Types” by C++ Reference (n.d.)

Character (char): Stores a single character, requires 1 byte, and typically has a range of -128 to 127 (or 0 to 255 for unsigned char). Note that char is technically an integer type: characters are stored as integers using their underlying character codes, such as ASCII.

Wide Character (wchar_t): Stores wide characters, usually requiring 2 or 4 bytes depending on the platform.

Boolean (bool): Stores logical values, true or false, and usually requires 1 byte. It is technically an integer type: false converts to 0 and true converts to 1, and when an integer is converted to bool, 0 becomes false and any nonzero value becomes true.

Floating Point (float): Stores single-precision floating-point numbers, usually requiring 4 bytes.

Double (double): Stores double-precision floating-point numbers, usually requiring 8 bytes.

Long Double (long double): “extended precision floating-point type. Does not necessarily map to types mandated by IEEE-754.” (C++ Reference, n.d., p.1)

Void (void): type with an empty set of values or absence of value, often used in functions that do not return any value.

“type with an empty set of values. It is an incomplete type that cannot be completed (consequently, objects of type void are disallowed). There are no arrays of void, nor references to void. However, pointers to void and functions returning type void (procedures in other languages) are permitted.” (C++ Reference, n.d., p.1)
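
The sizes quoted above are typical rather than guaranteed, so it can help to check what a given platform actually provides. The following is a minimal sketch using `sizeof` and `std::numeric_limits`; the helper names `describe_type` and `print_fundamental_types` are hypothetical and not part of the original article.

```cpp
#include <iostream>
#include <limits>
#include <string>

// Hypothetical helper: prints the size and range of an arithmetic type T
// on the current platform.
template <typename T>
void describe_type(const std::string& name) {
    // Unary plus promotes char and bool to int so they print as numbers.
    std::cout << name << ": " << sizeof(T) << " byte(s), range "
              << +std::numeric_limits<T>::lowest() << " to "
              << +std::numeric_limits<T>::max() << '\n';
}

// Prints the fundamental types discussed above.
void print_fundamental_types() {
    describe_type<char>("char");
    describe_type<unsigned char>("unsigned char");
    describe_type<wchar_t>("wchar_t");
    describe_type<bool>("bool");
    describe_type<int>("int");
    describe_type<float>("float");
    describe_type<double>("double");
    describe_type<long double>("long double");
}
```

Calling `print_fundamental_types()` from `main` lists the values for the machine and compiler in use; the output may differ between platforms, which is the point of the check.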

  • Derived data types are derived from primitive data types, and they include:

Array: collection of elements of the same type.
Pointer: stores the memory address of a variable.
Function: block of code that performs a specific task.
Reference: alias to another variable.
(Argarwal, 2023)
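
The four derived types can be seen together in a short sketch. The function and variable names below (`sum`, `double_in_place`, `derived_types_demo`) are illustrative assumptions, not taken from the cited source.

```cpp
#include <cstddef>

// Function: a block of code that performs a specific task.
int sum(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];  // Indexing through the pointer reads each element
    }
    return total;
}

// Reference parameter: 'value' is an alias for the caller's variable.
void double_in_place(int& value) {
    value *= 2;
}

int derived_types_demo() {
    int numbers[3] = {1, 2, 3};  // Array: elements of the same type
    int* first = numbers;        // Pointer: holds the address of numbers[0]
    int& alias = numbers[2];     // Reference: another name for numbers[2]

    double_in_place(alias);      // numbers[2] becomes 6
    *first = 10;                 // numbers[0] becomes 10 through the pointer

    return sum(numbers, 3);      // 10 + 2 + 6
}
```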

  • User-defined Data Types, as the name suggests, are defined by the user:
  1. Class: Groups variables and functions into a single unit.
  2. Structure (struct): Similar to a class, but with public default access.
  3. Union: Stores different data types in the same memory location.
  4. Enumeration (enum): A set of names associated with integer constants, used extensively in the video game industry.
  5. Typedef: Allows creating new names for existing data types.
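
As a rough illustration of the five user-defined types listed above, consider the sketch below. All type names (`Player`, `Point`, `Number`, `Direction`, `Score`) are hypothetical examples chosen for this article.

```cpp
#include <string>
#include <utility>

// Class: members are private by default.
class Player {
public:
    explicit Player(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Struct: members are public by default.
struct Point {
    int x;
    int y;
};

// Union: all members share the same storage; only one is active at a time.
union Number {
    int as_int;
    float as_float;
};

// Enumeration: named integer constants (a scoped enum is shown here).
enum class Direction { North, East, South, West };

// Typedef: a new name for an existing type.
typedef unsigned int Score;
```

Reading a union member other than the one most recently written is undefined behavior in C++, which is one reason unions need more care than structs.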

Main Differences Between Models: C++ vs Java

Before listing the differences between the C++ and Java data types, I would like to discuss the main differences and similarities between the two languages.

Both languages support Object-Oriented Programming (OOP). However, C++ is platform-dependent because it compiles to native machine code for a specific platform, while Java is platform-independent: the Java compiler translates source code into bytecode, which can run on any system with a Java Virtual Machine (JVM). This is referred to as the “Write Once, Run Anywhere” capability (Eck, 2015). Additionally, C++ supports both procedural and object-oriented programming and gives direct access to memory, which makes it more suitable for programming operating systems, whereas Java is strictly OOP. As a result, Java lacks support for global variables and free functions, which C++ allows; Java encapsulates everything within classes.
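
To make the last point concrete, here is a small sketch contrasting a global variable and a free function with the class-based form that Java requires. The names `request_count`, `next_request_id`, and `RequestCounter` are hypothetical.

```cpp
// A global variable and a free function: both live outside any class,
// which C++ permits and Java does not.
int request_count = 0;

int next_request_id() {
    return ++request_count;
}

// The same logic encapsulated in a class, closer to the Java style.
class RequestCounter {
public:
    int next() { return ++count_; }
private:
    int count_ = 0;  // State is private to each RequestCounter object
};
```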

Data Types Differences Between C++ and Java

Below is a list of data type differences between C++ and Java.

  • Both languages have primitive data types; however, C++ supports both signed and unsigned integer types, which makes its primitive data types more flexible but also more error-prone, for example when signed and unsigned values are mixed. Java's integer primitive types (byte, short, int, and long) are always signed (Eck, 2015).
  • Java primitive data types have fixed sizes and are platform-independent, whereas the storage sizes of C++ primitive data types are platform-dependent.
  • C++ treats primitive and object types consistently. Java, in contrast, distinguishes between primitives and objects; however, it provides “wrapper classes” for primitive types, such as Integer for int, allowing them to have object-like behavior (Sruthy, 2024).
  • Java does not support structure or union data types, whereas C++ does.
  • C++ supports pointers and reference types with direct memory access, whereas Java has very limited support: it offers object references but no pointer arithmetic. This is by design and for security reasons.
  • C++ variables can have global scope as well as namespace scope. Java variables, on the other hand, have no global scope; however, they can have package scope.
  • Java supports documentation comments (“Javadoc”) as part of its standard tooling, whereas C++ has no built-in equivalent and relies on external tools such as Doxygen.
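
Two of the points above, platform-dependent sizes and unsigned types, can be sketched in a few lines. The fixed-width aliases in `<cstdint>` are one way to get Java-like predictable sizes in C++; the function name `decrement` is a hypothetical example.

```cpp
#include <cstdint>
#include <limits>

// Fixed-width aliases from <cstdint> have the same width wherever they are
// provided, unlike plain int or long.
static_assert(std::numeric_limits<std::int32_t>::digits == 31,
              "int32_t has 31 value bits plus a sign bit");
static_assert(std::numeric_limits<std::uint8_t>::max() == 255,
              "uint8_t holds 0 to 255");

// Unsigned arithmetic wraps around modulo 2^N instead of going negative.
std::uint32_t decrement(std::uint32_t value) {
    return value - 1u;
}
```

Here `decrement(0)` yields 4294967295 rather than -1, which illustrates why unsigned types need extra care in length and index calculations.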

Possible Vulnerabilities When Using C++ Data Types

If not handled correctly, C++ data types may create security issues. Thus, it is crucial to understand and identify C++ data type vulnerabilities. Below is a list of the most common vulnerabilities that may arise when handling C++ data types, along with ways to mitigate them.

1. Buffer Overflows

A buffer overflow occurs when data written to memory exceeds the buffer's allocated capacity. This can lead to program crashes or malicious code execution. To prevent buffer overflows, it is important to check bounds before writing to a buffer and to use library functions that limit how much they write, like “fgets.” Refrain from using “scanf,” “strcpy,” “printf,” “gets,” and “strcat” with unchecked input, as they are prone to buffer overflows and related memory errors (GuardRails, 2023).

See the code example below.

#include <iostream>
#include <cstdio>

int main() {
    char buffer[10];
    std::cout << "Enter a string: ";
    fgets(buffer, sizeof(buffer), stdin);  // Avoids buffer overflow by 
                                           // limiting input
    std::cout << "You entered: " << buffer << std::endl;
    return 0;
}

2. Integer Overflow and Underflow

Integer overflow occurs when the value being stored in an integer variable exceeds the maximum value that the integer type can hold (Snyk Security Research Team, 2022). Integer underflow occurs when a value becomes less than the minimum value the integer type can hold. In C++, signed integer overflow is undefined behavior, so it must be detected before the operation rather than after.

To prevent integer overflow and underflow, always validate the values before performing arithmetic operations. See the code example below:

#include <iostream>
#include <climits> // For INT_MAX

int add(int a, int b) {
    if (a > 0 && b > INT_MAX - a) {
        std::cerr << "Integer overflow detected!" << std::endl;
        return -1;  
    }
    return a + b;
}

int main() {
    int x = INT_MAX - 1;
    int y = 2;
    std::cout << "Result: " << add(x, y) << std::endl;
    return 0;
}

Note that the add function checks whether the addition would cause an overflow before performing it. Here, the sum of INT_MAX - 1 and 2 would be greater than the maximum integer value allowed, so the function reports the overflow and returns -1 instead of adding.
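
The example above checks only the upper bound and uses -1 as an error value, which is also a valid sum. A possible variant, sketched here under the assumption that C++17 is available, checks both bounds and returns `std::optional<int>`; the name `checked_add` is hypothetical.

```cpp
#include <climits>
#include <optional>

// Hypothetical helper: returns the sum, or std::nullopt if a + b would
// overflow or underflow the range of int.
std::optional<int> checked_add(int a, int b) {
    if (b > 0 && a > INT_MAX - b) {
        return std::nullopt;  // Would exceed INT_MAX
    }
    if (b < 0 && a < INT_MIN - b) {
        return std::nullopt;  // Would fall below INT_MIN
    }
    return a + b;             // Safe: the result fits in an int
}
```

The caller can then test the result with `has_value()` before using it, so an error can never be mistaken for a legitimate sum.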

3. Incorrect Type Conversion

Incorrect type conversions often happen when converting a signed integer to an unsigned integer, which can silently change the value or lose data. This can be avoided by not relying on implicit conversions: for example, avoid narrowing a double to a float or converting a signed type to an unsigned type without checking the value first.

See the code example below:

#include <iostream>  

void checkLength(int len) {
    if (len < 0) {
        std::cerr << "Negative length detected!" << std::endl;
        return;
    }
    std::cout << "Length is: " << len << std::endl;
}

int main() {
    // Incorrect implicit conversion: -5 wraps around to a large positive value
    unsigned int value = -5;  // unsigned int values are always non-negative
    checkLength(static_cast<int>(value));  // Explicitly convert back to int before validating
    return 0;
}
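
A range check before the cast makes the conversion explicit in both directions. The following is a sketch, assuming C++17 for `std::optional`; the helper names `to_int_checked` and `to_unsigned_checked` are hypothetical.

```cpp
#include <climits>
#include <optional>

// Hypothetical helper: converts unsigned int to int only when the value fits.
std::optional<int> to_int_checked(unsigned int value) {
    if (value > static_cast<unsigned int>(INT_MAX)) {
        return std::nullopt;  // Too large to represent as int
    }
    return static_cast<int>(value);
}

// Hypothetical helper: converts int to unsigned int only when non-negative.
std::optional<unsigned int> to_unsigned_checked(int value) {
    if (value < 0) {
        return std::nullopt;  // Negative values have no unsigned equivalent
    }
    return static_cast<unsigned int>(value);
}
```

With these helpers, a value such as -5 is rejected at the conversion site instead of turning into a large positive number.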

4. Pointer initialization

Dereferencing a null pointer is a well-known vulnerability that can lead to program crashes and potential exploitation. Always initialize pointers and check them against ‘nullptr’ before dereferencing them.

See the code below for an example.

struct list {
    void *p;
    struct list *next;
};

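int f(struct list *l) {
    int max = -1;
    while (l) {  // Stops when the list pointer is null
        if (l->p != nullptr) {  // Check the data pointer before dereferencing
            int i = *static_cast<int*>(l->p); // Cast void* to int*
            max = i > max ? i : max;
        }
        l = l->next;
    }
    return max;
}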

Note that ‘void*’ is used to store a generic pointer in the linked list, and it is cast to ‘int*’ before being dereferenced.
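
For a single pointer, the same advice can be sketched as follows: initialize the pointer explicitly and check it before reading through it. The names `value_or` and `pointer_demo` are hypothetical.

```cpp
// Returns the value pointed to by 'ptr', or 'fallback' when 'ptr' is null,
// so a null pointer is never dereferenced.
int value_or(const int* ptr, int fallback) {
    return ptr != nullptr ? *ptr : fallback;
}

int pointer_demo() {
    int* missing = nullptr;  // Initialize explicitly instead of leaving it indeterminate
    int number = 42;
    int* valid = &number;    // Points to a live object

    return value_or(missing, -1) + value_or(valid, -1);  // -1 + 42
}
```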

To summarize, understanding C++ data types and their associated risks, as well as how C++ differs from Java, is critical for writing secure programs. These risks can be mitigated by carefully managing data types, performing bounds checks, validating type conversions, and handling pointers safely, such as by checking for null values and casting void pointers correctly.


References:

Adam, J. & Kell, S. (2020). Type checking beyond type checkers, via slice & run. In Proceedings of the 11th ACM SIGPLAN International Workshop on Tools for Automatic Program Analysis (TAPAS 2020). Association for Computing Machinery, 23–29. Retrieved from: https://dl-acm-org.csuglobal.idm.oclc.org/doi/10.1145/3427764.3428324

Argarwal, H. (2023, September 23). C++ data types. GeeksforGeeks. https://www.geeksforgeeks.org/cpp-data-types/

C++ Reference. (n.d.). Fundamental types. cppreference.com. https://en.cppreference.com/w/cpp/language/types/

Eck, D. J. (2015). Chapter 1 Overview: The mental landscape. Introduction to programming using Java (7th ed.). CC BY-NC-SA 3.0. http://math.hws.edu/javanotes/

GuardRails. (2023, April 27). The Top C++ security vulnerabilities and how to mitigate them. Security Boulevard. https://securityboulevard.com/2023/04/the-top-c-security-vulnerabilities-and-how-to-mitigate-them/

Snyk Security Research Team. (2022, August 16). Snyk. https://snyk.io/blog/top-5-c-security-risks/

Sruthy. (2024, March 7). C++ vs Java: Top 30 differences between C++ and Java with examples. Software Testing Help. https://www.softwaretestinghelp.com/cpp-vs-java/


Originally published at Alex.omegapy on Medium published by Level UP Coding on October 12, 2024.
