Calin Baenen

Posted on

Is there a way to have raw-types in (modern) C++?

What are "raw" types?

A raw type (in Java) is a type that takes type arguments (generics/templates*), but whose type arguments are not specified in a declaration.
Consider the following (Java) code:

class NeverRaw1 {
    NeverRaw1(int i) { this.value = i; }
    int value;
}

class PossiblyRaw<T> {
    PossiblyRaw(T v) { this.value = v; }
    T value;
}

// Java generics accept reference types only, so Integer is used instead of int.
class NeverRaw2 extends PossiblyRaw<Integer> {
    NeverRaw2(int i) { super(i); }
}

// ... Some code later...

// This isn't raw because a type argument is specified.
PossiblyRaw<Integer> notRaw = new PossiblyRaw<>(10);
// This IS raw because none are specified.
PossiblyRaw uncooked = new PossiblyRaw<Integer>(10);

// Invalid. Mismatched types.
notRaw = new PossiblyRaw<Float>(10.56f);
// Valid. No type mismatch because none is specified.
uncooked = new PossiblyRaw<Float>(10.56f);

Why would you want raw types?

So, the first thought that probably popped into your head is "Hey, isn't that not type-safe?", and to that I say, "Yes, it can be unsafe at times, but it can also be put to good use."

So? Where can I use this?
Well, here's where I'm stuck: I'm implementing a Token type for Janky, and I want a parse(std::string|char*) function that returns an array of Token.
How is this a problem?
It's a problem because even if you only want to return an array of something, you must specify its template arguments.
My Token type is written as such:

template<typename T, T v> struct Token {
    T              value = v;
    TokenType      type = TokenType::_UNASSIGNED;
    unsigned short col = 0;
    unsigned short ln = 0;
};

And I can't create any abstraction, since all pieces of this structure are important to have, and because a derived type's extra members aren't preserved when you treat a piece of data as its parent type.
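
To make the "members aren't preserved" point concrete, here is a minimal sketch of what happens when a derived token is stored by value in a container of its parent type. BaseToken, IntToken and store_by_value are hypothetical names made up for this illustration; they are not part of the Token type above.

```cpp
#include <vector>

// Hypothetical parent type: only the position info is shared.
struct BaseToken {
    unsigned short col = 0;
    unsigned short ln = 0;
};

// Hypothetical derived type that adds the typed value.
struct IntToken : BaseToken {
    int value = 0;
};

// Storing by value in a container of the parent type copies only the
// BaseToken part of the object; `value` is sliced off.
std::vector<BaseToken> store_by_value(const IntToken& t)
{
    std::vector<BaseToken> tokens;
    tokens.push_back(t);  // object slicing happens here
    return tokens;
}
```

The elements of the returned vector have col and ln, but no value member at all, which is the loss described above.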

So...

What's the best solution here?

Thanks!
Cheers!

Top comments (18)

Phantz • Edited

You've essentially stumbled upon the problem of creating a function that returns polymorphic containers without giving up type safety. Raw types may seem like an easy, type-unsafe way out, but let's simplify the problem and see if a better solution exists.

Imagine a function that returns an empty list. Now, an empty list can be safely assigned to any std::list<T>, for all T. Because it's empty! Would you use raw types there? How about this instead-

#include <list>

template <typename T>
std::list<T> nil()
{
    std::list<T> l{};
    return l;
}

It's type safe, and you didn't need "raw types"! This is one of the fundamental concepts in type theory - the forall quantification. In Haskell, it's equivalent to-

nil :: forall a. [a]
nil = []

Which essentially reads- "The type of nil is list of a for all a".

I suggest doing something similar for your use case. Make your parse function accept a type parameter.

Addendum: You're not gonna get the awesomesauce type inference stuff in C++ with this. In that example, you'll have to specify the type parameter to the nil call even if you already have a typed left-hand side. But I'm assuming you're more interested in type safety than ease of use. If you're looking for both, Hindley-Milner is that way :)
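
As a rough sketch of what that addendum means at the call site: the nil template is repeated from above so the snippet stands alone, while empty_ints and empty_strings are hypothetical helper names added only for illustration.

```cpp
#include <list>
#include <string>

template <typename T>
std::list<T> nil()
{
    return std::list<T>{};
}

// T cannot be deduced from the return type alone, so it has to be
// spelled out at the call site even though the left-hand side is typed.
std::list<int> empty_ints()
{
    // std::list<int> xs = nil();   // error: cannot deduce T
    std::list<int> xs = nil<int>(); // OK: T given explicitly
    return xs;
}

std::list<std::string> empty_strings()
{
    return nil<std::string>();
}
```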

Calin Baenen

Could you provide an example of how your example helps? I still think I need raw types, because the goal is to have multiples match up, so that Token<int, 2> can also be grouped with a Token<std::string, "hello">. - With raw types, I could just say Token, and I would only have to figure out what type is being used (which isn't too hard in C++).

Phantz

Ah, you want a heterogeneous array, not a polymorphic one. If it's not at all possible to design your API in a way that avoids heterogeneous arrays, your only option is to use std::variant. You can't have just Token though, since that immediately kills static typing.

You'll most likely need to throw in a whole bunch of holds_alternative checks before you can actually use the value though. Yeah, I know it's painful - but that's not an inherent drawback of type safe static typing, it's just C++.
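
A minimal sketch of what those checks look like in practice, assuming C++17. Value and sum_ints are hypothetical names, and the two alternatives (int and std::string) are picked only to mirror the examples in this thread.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical alias: one element type that can hold either alternative.
using Value = std::variant<int, std::string>;

// Add up the int elements of a heterogeneous array and skip the rest.
int sum_ints(const std::vector<Value>& values)
{
    int total = 0;
    for (const Value& v : values) {
        // Check which alternative is active before reading it.
        if (std::holds_alternative<int>(v)) {
            total += std::get<int>(v);
        }
    }
    return total;
}
```

A std::vector<Value> can then hold an int next to a std::string, and every read goes through a check like the one in the loop.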

Calin Baenen

"but that's not an inherent drawback of type safe static typing, it's just C++."

C++ is my favorite language, but in this regard, it treats things very stupidly.
Sure, Token (kind of) kills static typing, but you must admit, for the purpose of having a flexible array (or any structure), allowing raw types would be nice as an option. -- Or, at least make it easier to reach the end goal for something like this.
(Maybe I'll just make my own heterogeneous array implementation, if that's considered "okay" by most people's logic.)

Phantz • Edited

Also, regarding an API redesign from my previous reply. This is what I generally see token types implemented as, for parsers/lexers-

data TokenValue = StringLiteral String | IntConst Int

data Token = Token
  { tokenValue :: TokenValue
  , tokenType :: TokenType
  , posColumn :: Int
  , posLine :: Int
  }

That's Haskell, but it should be readable regardless. Notice how the raw token value is the std::variant, and the Token is a wrapper around it. The T that you pass to your Token template is not present here, because it doesn't need to be present. TokenValue is actually a tagged union. std::variant is a really roundabout and overly complex way of doing tagged unions, so it's equivalent.

I really don't think you'll ever need to track the T that you use for the value field. It should just be tracked by the tagged union (since it's a runtime concept).
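
For comparison, a rough C++17 sketch of the same shape. This is an assumption about how the Haskell version might translate, not code from Janky: the enumerator names and the describe helper are hypothetical, and Unassigned stands in for _UNASSIGNED because identifiers starting with an underscore and a capital letter are reserved in C++.

```cpp
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical enumerators, for illustration only.
enum class TokenType { Unassigned, IntConst, StringLiteral };

// The tagged union: the active alternative is tracked at runtime.
using TokenValue = std::variant<int, std::string>;

// No template parameters, so std::vector<Token> is a single type.
struct Token {
    TokenValue     value;
    TokenType      type = TokenType::Unassigned;
    unsigned short col = 0;
    unsigned short ln = 0;
};

// Render a token's value by visiting whichever alternative is active.
std::string describe(const Token& tok)
{
    return std::visit([](const auto& v) -> std::string {
        using V = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<V, int>) {
            return "int " + std::to_string(v);
        } else {
            return "string " + v;
        }
    }, tok.value);
}
```

With this layout a parse function could return std::vector<Token> directly, and the int inside an IntConst token is still a real int that can be used for arithmetic.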

Leroy

I don't think this is possible in cpp. All templates are compiled to typed classes. Maybe if you involve some C trickery where you use a void* or union a ton of types together, but there are just a ton of ifs and buts with that. I'd try to come up with an alternative approach to the problem.

Calin Baenen

Is there any way I can store a value with a type in the token, so that I can store additional info?
I did think of void*, and nullptr as default values. But that doesn't solve my problem of needing a raw type (unless I want all the data to be untraceable).

Calin Baenen

By untraceable, I mean if I have void* value, it'd be impossible to get the origin-type of the pointer back.
Plus, this is also bad, because that means all the information has to be a pointer, which could lead to many (dangling) bugs.

Paul J. Lucas

Your Token should just contain a std::string inside it (that's a substring of the original string that was parsed). That's it. No templates involved at all. I don't see why that's a problem.

Calin Baenen

Because I want the data to be extremely-literal. - I want the Tokens to reflect their C++ values.

For example, I want a number/int token to be parsed, and reflected with C++'s built-in int type. E.g. Janky:13 = C++:Token<int>(13);

The reason I don't wanna use an std::string is because I have to parse the value later, when I could do it all at once, and not worry about it later. (This would be useful for easy-arithmetic between 2 tokens of the same type.)

Paul J. Lucas

Have you looked at std::any?
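
A minimal sketch of how std::any (C++17) could hold the token value. AnyToken and int_or are hypothetical names used only for this illustration.

```cpp
#include <any>
#include <string>

// Hypothetical token type: `value` can hold any copyable type, and the
// std::any remembers which type it currently stores.
struct AnyToken {
    std::any       value;
    unsigned short col = 0;
    unsigned short ln = 0;
};

// Return the stored int, or `fallback` if the token holds something else.
int int_or(const AnyToken& tok, int fallback)
{
    // The pointer form of any_cast returns nullptr on a type mismatch
    // instead of throwing std::bad_any_cast.
    if (const int* p = std::any_cast<int>(&tok.value)) {
        return *p;
    }
    return fallback;
}
```

Unlike a void*, the std::any owns its value and can be asked for its type, but the caller still has to name the expected type on every read.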

Calin Baenen

No.
But it looks interesting.

Thanks for pointing me in a direction.

Pierre Gradot • Edited

From what I understand, you want a type that can hold several possible types. This is a union in C/C++.

From C++17, you can use en.cppreference.com/w/cpp/utility/...

Define your own variant type that can hold any kind of token you want to handle. Then process variables of this variant type.

Calin Baenen

I thought of a solution. Just create an abstraction in the form of a new struct (TokenInfo):

template<typename T=void*, T v=nullptr> struct TokenInfo {
    T value;
    TokenType type;
};

struct Token {
    TokenInfo info;
};
 
Calin Baenen

This doesn't work, obviously. I'm a dumbass

Matt Ellen-Tsivintzeli

If you were doing this in Java, how would you know the type of the value?

Calin Baenen

Well, unlike in Java, you can actually tell the type of a template argument at runtime! HAH-HAH!

Calin Baenen

Good point. - I'm throwing myself for a loop on how I can actually get what I want.