What are "raw" types?
A raw type (in Java) is a generic type (one that takes type parameters, much like a C++ template) whose type arguments are not specified in a declaration.
Consider the following (Java) code:
class NeverRaw1 {
    NeverRaw1(int i) { this.value = i; }
    int value;
}
class PossiblyRaw<T> {
    PossiblyRaw(T v) { this.value = v; }
    T value;
}
// Java type arguments must be reference types, so Integer/Float stand in for int/float.
class NeverRaw2 extends PossiblyRaw<Integer> {
    // The parent has no no-arg constructor, so forward the value to it.
    NeverRaw2(int i) { super(i); }
}
// ... Some code later...
// This isn't raw because a type argument is specified.
PossiblyRaw<Integer> notRaw = new PossiblyRaw<>(10);
// This IS raw because none are specified
PossiblyRaw uncooked = new PossiblyRaw<Integer>(10);
// Invalid. Mismatched types.
notRaw = new PossiblyRaw<Float>(10.56f);
// Valid. No type mismatch because none is specified.
uncooked = new PossiblyRaw<Float>(10.56f);
Why would you want raw types?
So, the first thought that probably popped into your head is "Hey, isn't that type-unsafe?", and to that I say, "Yes, it can be unsafe at times. But it can also be useful."
So? Where can I use this?
Well, here's where I'm stuck. I'm implementing a Token type for Janky, and I want to have a parse(std::string|char*) function that returns an array of Token.
How is this a problem?
It's a problem because even if you want to return an array of something, you must define the template arguments.
My Token type is written as such:
template<typename T, T v> struct Token {
    T value = v;
    TokenType type = TokenType::_UNASSIGNED;
    unsigned short col = 0;
    unsigned short ln = 0;
};
And I can't create any abstraction, since all pieces of this structure are important to have, and because a derived type's members aren't preserved once you say a piece of data is of its parent type.
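To make the obstacle concrete, here is a minimal sketch of why the template above resists being put in one array. The `TokenType` enum is a hypothetical stand-in (the real one is not shown in the post), and its enumerator is renamed `Unassigned` here because identifiers starting with an underscore and a capital letter are reserved in C++.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the post's TokenType enum.
enum class TokenType { Unassigned };

template <typename T, T v>
struct Token {
    T value = v;
    TokenType type = TokenType::Unassigned;
    unsigned short col = 0;
    unsigned short ln = 0;
};

// Every distinct argument list produces a separate, unrelated type.
static_assert(!std::is_same_v<Token<int, 2>, Token<int, 3>>);
static_assert(!std::is_same_v<Token<int, 2>, Token<char, 'a'>>);

// A container has to commit to exactly one instantiation,
// so this vector can only ever hold tokens whose value is 2.
inline std::vector<Token<int, 2>> only_twos() {
    return std::vector<Token<int, 2>>(3);
}
```

Because the value is part of the type, even two int tokens with different values cannot share a `std::vector`, which is why there is no bare `Token` to return from `parse`.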
So...
What's the best solution here?
Top comments (18)
You've essentially stumbled upon the problem of creating a function to return polymorphic containers, without giving up type safety. Raw types may seem like an easy, type-unsafe way out, but let's simplify the problem and see if a better solution exists.
Imagine a function that returns an empty list. Now, an empty list can be safely assigned to any std::list<T>, for all T. Because it's empty! Would you use raw types there? How about a function template, parameterized on T, that returns an empty std::list<T> instead? It's type safe, and you didn't need "raw types"! This is one of the fundamental concepts in type theory: the forall quantification. In Haskell, it's equivalent to a signature along the lines of nil :: forall a. [a], which essentially reads: "The type of nil is list of a for all a".
I suggest doing something similar for your use case. Make your parse function accept a type parameter.
Addendum: You're not gonna get the awesomesauce type inference stuff in C++ with this. In that example, you'll have to specify the type parameter to the nil call even if you already have a typed left-hand side. But I'm assuming you're more interested in type safety than ease of use. If you're looking for both, Hindley-Milner is that way :)
Could you provide an example of how your example helps? I still think I need raw types, because the goal is to have multiple types match up, so that Token<int, 2> can also be grouped with a Token<std::string, "hello">. With raw types, I could just say Token, and I would only have to figure out what type is being used (which isn't too hard in C++).
Ah, you want a heterogeneous array, not a polymorphic one. If it's not at all possible to design your API in a way that avoids needing heterogeneous arrays, your only option is to use std::variant. You can't have just Token though, since that immediately kills static typing.
You'll most likely need to throw in a whole bunch of holds_alternative checks before you can actually use the value though. Yeah, I know it's painful, but that's not an inherent drawback of type-safe static typing, it's just C++.
C++ is my favorite language, but in this regard, it treats things very stupidly.
Sure, Token (kind of) kills static typing, but you must admit, for the purpose of having a flexible array (or any structure), allowing raw types would be nice as an option. Or, at least, make it easier to reach the end goal for something like this.
(Maybe I'll just make my own heterogeneous array implementation, if that's considered "okay" by most people's logic.)
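The std::variant suggestion from this exchange can be sketched roughly as follows. This is only an illustration under assumptions: the alias `Value`, its list of alternatives, and the helper names `count_ints` and `sum_ints` are all hypothetical and do not come from the thread.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical alias: the closed set of types an element may hold.
using Value = std::variant<int, double, std::string>;

// Count the elements whose active alternative is int.
inline int count_ints(const std::vector<Value>& values) {
    int n = 0;
    for (const auto& v : values) {
        if (std::holds_alternative<int>(v)) ++n;
    }
    return n;
}

// Sum only the int elements; get_if returns nullptr for other alternatives.
inline int sum_ints(const std::vector<Value>& values) {
    int total = 0;
    for (const auto& v : values) {
        if (const int* p = std::get_if<int>(&v)) total += *p;
    }
    return total;
}
```

A `std::vector<Value>` is a heterogeneous array in the sense asked for, and each element still remembers which type it holds, so the check before use is explicit instead of implied.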
Also, regarding an API redesign from my previous reply. This is what I generally see token types implemented as, for parsers/lexers:
That's Haskell, but it should be readable regardless. Notice how the raw token value is the std::variant, and the Token is a wrapper around it. The T that you pass to your Token template is not present here, because it doesn't need to be present. TokenValue is actually a tagged union. std::variant is a really roundabout and overly complex way of doing tagged unions, so it's equivalent.
I really don't think you'll ever need to track the T that you use for the value field. It should just be tracked by the tagged union (since it's a runtime concept).
I don't think this is possible in cpp. All templates are compiled to typed classes. Maybe if you involve some C trickery where you use a void* or union a ton of types together, but there are just a ton of ifs and buts with that. I'd try to come up with an alternative approach to the problem.
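A C++ rendering of the "Token wraps a tagged union" design described above might look like the sketch below. It is an assumption-laden illustration, not the commenter's code: the alternatives in `TokenValue`, the `TokenType` enumerators, and the helpers `describe` and `add_ints` are all hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical token kinds; the post's real enum is not shown.
enum class TokenType { Unassigned, Integer, Float, String };

// The tagged union: the active alternative is tracked at runtime.
using TokenValue = std::variant<int, double, std::string>;

// Token is an ordinary struct, so std::vector<Token> just works.
struct Token {
    TokenValue value;
    TokenType type = TokenType::Unassigned;
    unsigned short col = 0;
    unsigned short ln = 0;
};

// Name the C++ type currently stored in a token.
inline std::string describe(const Token& t) {
    return std::visit([](const auto& v) -> std::string {
        using V = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<V, int>) return "int";
        else if constexpr (std::is_same_v<V, double>) return "double";
        else return "string";
    }, t.value);
}

// Arithmetic between two tokens, allowed only when both hold an int.
inline std::optional<int> add_ints(const Token& a, const Token& b) {
    const int* x = std::get_if<int>(&a.value);
    const int* y = std::get_if<int>(&b.value);
    if (x && y) return *x + *y;
    return std::nullopt;
}
```

The value is still stored as a real `int`, `double`, or `std::string`, so nothing needs to be re-parsed later, yet the type parameter has disappeared from `Token` itself.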
Is there any way I can store a value with a type in the token, so that I can store additional info?
I did think of void*, and nullptr as default values. But that doesn't solve my problem of needing a raw type (unless I want all the data to be untraceable).
By untraceable, I mean if I have void* value, it'd be impossible to get the origin-type of the pointer back.
Plus, this is also bad, because that means all the information has to be a pointer, which could lead to many (dangling) bugs.
Your Token should just contain a std::string inside it (that's a substring of the original string that was parsed). That's it. No templates involved at all. I don't see why that's a problem.
Because I want the data to be extremely literal. I want the Tokens to reflect their C++ values.
For example, I want a number/int token to be parsed, and reflected with C++'s built-in int type. E.g. Janky: 13 = C++: Token<int>(13);
The reason I don't wanna use an std::string is because I have to parse the value later, when I could do it all at once, and not worry about it later. (This would be useful for easy arithmetic between 2 tokens of the same type.)
Have you looked at std::any?
No.
But it looks interesting.
Thanks for pointing me in this direction.
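For comparison, a rough sketch of what the std::any route looks like. The helper names `int_or` and `holds_string` are hypothetical; the calls themselves (`std::any_cast` on a pointer, `std::any::type`) are standard C++17.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <typeinfo>
#include <vector>

// Read an int out of an any, or return the fallback if it holds something else.
// The pointer form of any_cast returns nullptr on a type mismatch instead of throwing.
inline int int_or(const std::any& a, int fallback) {
    if (const int* p = std::any_cast<int>(&a)) return *p;
    return fallback;
}

// Check the stored type at runtime via type_info.
inline bool holds_string(const std::any& a) {
    return a.type() == typeid(std::string);
}
```

Unlike a `void*`, a `std::any` owns its value and remembers the original type, but unlike `std::variant` it accepts any type at all, so the compiler cannot tell you which cases you forgot to handle.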
From what I understand, you want a type that can hold several possible types. This is a union in C/C++.
From C++17, you can use en.cppreference.com/w/cpp/utility/...
Define your own variant type that can hold any kind of token you want to handle. Then process variables of this variant type.
I thought of a solution. Just create an abstraction in the form of a new struct (TokenValue):
This doesn't work, obviously. I'm a dumbass.
If you were doing this in Java, how would you know the type of the value?
Well, unlike in Java, you can actually tell the type of a template argument at runtime! HAH-HAH!
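As a small illustration of that point: C++ templates are not erased, so each instantiation knows its own argument. The `Box` struct and the helper names below are hypothetical. Note that the first check is resolved at compile time, and the second uses RTTI at runtime.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <typeinfo>

// Hypothetical generic wrapper, standing in for a templated token.
template <typename T>
struct Box { T value; };

// Compile-time check: the answer is fixed per instantiation.
template <typename T>
constexpr bool holds_int_static(const Box<T>&) {
    return std::is_same_v<T, int>;
}

// Runtime check through type_info; gives the same answer here.
template <typename T>
bool holds_int_rtti(const Box<T>&) {
    return typeid(T) == typeid(int);
}
```

The catch is that you need the concrete `Box<T>` in hand to ask the question, which is exactly what is missing once differently typed tokens have to share one array.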
Good point. I'm throwing myself for a loop over how I can actually get what I want.