Quick Friday hit for the 0 people following along with my project and the dozen people discovering this years later from Google.
There aren't any good examples of creating a Ruby class with mruby that encapsulates data not used within the VM. This is something you need when you're using mruby to interface with a C library that provides its own data types. You want to call methods in Ruby which run C code calling out to the external library. Examples include database systems, graphics APIs and interacting with the operating system.
The process with mruby is different from mainline Ruby, though simpler. C Ruby requires a separate allocate method before initialize. With mruby we can create and store the C data entirely within the initialize method.
We'll use a contrived Foo_data struct:
struct Foo_data {
    uint32_t first[16];
    uint32_t second[16];
};
For demonstration we'll create a constructor for a Foo class taking one integer parameter. This parameter determines how to fill the first and second members of the Foo_data struct that belongs to each instance of Foo. We build the Ruby Foo class in C as usual:
mrb_state *state = mrb_open();
RClass *Foo = mrb_define_class(state, "Foo", state->object_class);
MRB_SET_INSTANCE_TT(Foo, MRB_TT_DATA);
mrb_define_method(state, Foo, "initialize", Foo_initialize, MRB_ARGS_REQ(1));
mrb_define_method(state, Foo, "check", Foo_check, MRB_ARGS_NONE());
There's a highly important new thing here. MRB_SET_INSTANCE_TT sets the type tag that instances of the class are created with. In this case we set it to the special MRB_TT_DATA, which marks instances as wrappers around C data. I neglected this step and encountered erratic, difficult-to-debug program behavior. The program would follow pointers to nowhere and quit outright, without throwing the Access Violation error normally seen when following wild pointers.
Next we create an mrb_data_type. This struct informs the mruby VM of the data type and of the function to call when the GC comes for the object. It has two members: a name and a function pointer. The name is a char * that identifies the data type and should be unique; the name of the class, "Foo", is fine. The function pointer is what the GC calls when it destroys the object. If you're not doing anything special you can use the default built-in function mrb_free:
static const mrb_data_type Foo_type = {
    "Foo", mrb_free
};
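To picture what such a descriptor is for, here is a toy model in plain C++ with no mruby involved. ToyDataType, ToyObject, toy_free and toy_destroy are made-up names for illustration, not mruby's API, and the real collector is certainly more involved. The sketch only shows the idea that the object carries an untyped pointer plus a descriptor, and that whoever destroys the object asks the descriptor how to release the data.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Toy stand-ins, NOT mruby's real types: a descriptor pairs a name with a
// function that knows how to release the data it describes.
struct ToyDataType {
    const char *name;
    void (*free_fn)(void *);
};

// Toy object: an untyped pointer plus the descriptor that explains it.
struct ToyObject {
    void *data;
    const ToyDataType *type;
};

struct Foo_data {
    uint32_t first[16];
    uint32_t second[16];
};

static int toy_free_calls = 0;  // counts releases so the effect is observable

static void toy_free(void *p) {
    std::free(p);
    ++toy_free_calls;
}

static const ToyDataType Toy_Foo_type = { "Foo", toy_free };

// What a collector might do when the object dies: hand the pointer to the
// descriptor's free function, then clear it.
static void toy_destroy(ToyObject &obj) {
    if (obj.data != nullptr && obj.type != nullptr) {
        obj.type->free_fn(obj.data);
    }
    obj.data = nullptr;
}

// Create an object owning a zeroed Foo_data, then destroy it twice.
// Returns how many times the free function actually ran.
static int toy_demo() {
    toy_free_calls = 0;
    ToyObject obj = { std::calloc(1, sizeof(Foo_data)), &Toy_Foo_type };
    assert(obj.data != nullptr);
    toy_destroy(obj);
    toy_destroy(obj);  // no-op: the pointer was already cleared
    return toy_free_calls;
}
```

Calling toy_demo() runs the free function once, even though toy_destroy is called twice, because the pointer is cleared after the first release.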
Next comes the initialize method. We allocate a new Foo_data and save it to the instance.
mrb_value
Foo_initialize(mrb_state* state, mrb_value self) {
    mrb_int n;
    mrb_get_args(state, "i", &n);
    // Free any data left over from an earlier initialize call on this object.
    Foo_data *foo = (Foo_data *)DATA_PTR(self);
    if(foo) { mrb_free(state, foo); }
    mrb_data_init(self, nullptr, &Foo_type);
    // Allocate with mrb_malloc so that mrb_free releases it with the matching allocator.
    foo = (Foo_data *)mrb_malloc(state, sizeof(Foo_data));
    for(uint32_t i = 0; i < 16; ++i) {
        foo->first[i] = i * n;
        foo->second[i] = i * n * n;
    }
    mrb_data_init(self, foo, &Foo_type);
    return self;
}
A couple of things need explaining here. DATA_PTR pulls a void * out of the specified object, in this case self, and we cast it to what we want, a Foo_data *. We then check whether a pointer is already associated with the instance and free it if so. I don't entirely understand the reasons, though this is recommended in this discussion and used in the mruby time gem; one likely reason is that initialize can be called again on an already initialized object, which would otherwise leak the old data.
We call mrb_data_init first to initialize the void * destined to hold the Foo_data *. Then comes the heap allocation, followed by filling out the data using the integer passed into the constructor. Calling mrb_data_init again with the populated Foo_data * saves the data to the instance.
We extract it again in check:
mrb_value
Foo_check(mrb_state* state, mrb_value self) {
    Foo_data *foo;
    Data_Get_Struct(state, self, &Foo_type, foo);
    mrb_assert(foo != nullptr);
    return mrb_nil_value();
}
Data_Get_Struct is a macro which will pull out and type cast the void * saved to the instance. All of the definitions and implementations are in data.h and data.c in the mruby code.
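The macro also takes &Foo_type, which suggests the lookup is checked against the descriptor stored on the instance. A toy model of that idea follows, assuming a pointer-identity comparison. ToyDataType, ToyObject and toy_get_ptr are hypothetical names in plain C++, not mruby's API, so read data.c for what your version really does.

```cpp
#include <cassert>

// Toy stand-ins, not mruby's types.
struct ToyDataType { const char *name; };

struct ToyObject {
    void *data;
    const ToyDataType *type;
};

static const ToyDataType Toy_Foo_type = { "Foo" };
static const ToyDataType Toy_Bar_type = { "Bar" };

// Hand back the stored pointer only when the object's descriptor is the
// very same descriptor the caller expects (pointer identity, not name).
static void *toy_get_ptr(const ToyObject &obj, const ToyDataType *expected) {
    if (obj.type != expected) {
        return nullptr;  // a real VM might raise a type error instead
    }
    return obj.data;
}
```

With an object tagged Toy_Foo_type, asking for Toy_Foo_type returns the stored pointer, while asking for Toy_Bar_type returns nullptr instead of letting the caller misread the memory.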
Now all that's left is to create instances of Foo with different data and confirm with a debugger that the data is what we expect.
mrb_load_string(state, "a = Foo.new(10); a.check; b = Foo.new(50); b.check");
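To know what to look for in the debugger, the fill rule can be reproduced without the VM. This is a minimal sketch in plain C++; expected_foo is a hypothetical helper name, and it assumes n is small and non-negative so nothing overflows.

```cpp
#include <cassert>
#include <cstdint>

struct Foo_data {
    uint32_t first[16];
    uint32_t second[16];
};

// Same fill rule as Foo_initialize, without the VM:
// first[i] = i * n and second[i] = i * n * n.
static Foo_data expected_foo(uint32_t n) {
    Foo_data foo;
    for (uint32_t i = 0; i < 16; ++i) {
        foo.first[i] = i * n;
        foo.second[i] = i * n * n;
    }
    return foo;
}
```

For Foo.new(10), first should run 0, 10, 20 up to 150 and second 0, 100, 200 up to 1500. For Foo.new(50), the last elements should be 750 and 37500.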
And that's it! With so little information available, working out how to create an mruby class that stores arbitrary C data seems daunting at first. I hope this makes the process clear and helps someone wanting to do the same.
The full program:
#include <stdint.h>
#include <stdlib.h>
// Adjust these paths to wherever the mruby headers live in your project.
#include "mruby.h"
#include "ext/mruby/data.h"
#include "ext/mruby/class.h"
#include "ext/mruby/compile.h"

struct Foo_data {
    uint32_t first[16];
    uint32_t second[16];
};

static const mrb_data_type Foo_type = {
    "Foo", mrb_free
};

mrb_value
Foo_initialize(mrb_state* state, mrb_value self) {
    mrb_int n;
    mrb_get_args(state, "i", &n);
    // Free any data left over from an earlier initialize call on this object.
    Foo_data *foo = (Foo_data *)DATA_PTR(self);
    if(foo) { mrb_free(state, foo); }
    mrb_data_init(self, nullptr, &Foo_type);
    // Allocate with mrb_malloc so that mrb_free releases it with the matching allocator.
    foo = (Foo_data *)mrb_malloc(state, sizeof(Foo_data));
    for(uint32_t i = 0; i < 16; ++i) {
        foo->first[i] = i * n;
        foo->second[i] = i * n * n;
    }
    mrb_data_init(self, foo, &Foo_type);
    return self;
}

mrb_value
Foo_check(mrb_state* state, mrb_value self) {
    Foo_data *foo;
    Data_Get_Struct(state, self, &Foo_type, foo);
    mrb_assert(foo != nullptr);
    return mrb_nil_value();
}

int main() {
    mrb_state *state = mrb_open();
    RClass *Foo = mrb_define_class(state, "Foo", state->object_class);
    MRB_SET_INSTANCE_TT(Foo, MRB_TT_DATA);
    mrb_define_method(state, Foo, "initialize", Foo_initialize, MRB_ARGS_REQ(1));
    mrb_define_method(state, Foo, "check", Foo_check, MRB_ARGS_NONE());
    mrb_load_string(state, "a = Foo.new(10); a.check; b = Foo.new(50); b.check");
    // Shut the VM down so the GC releases the Foo instances.
    mrb_close(state);
    return 0;
}


Top comments (3)
That's a pretty useful tutorial!
One question though: here you are calling mrb_data_init on the mrb_value self in the Foo constructor (Foo_initialize), so I suppose this mrb_value is already setup to receive a data pointer. How do you initialize an mrb_value with custom C data outside of a constructor, e.g. from within a normal function (doing the equivalent of a Foo.new inside a C function)?
I have tried this:
mystruct* val = ... ; // allocation
mrb_value v;
mrb_data_init(v, (void*)val, &mydatatype);
but the mrb_data_init is producing a segfault. I suppose there is something to do with mrb_data_object_alloc first, but this function returns an RData*, not an mrb_value, and I haven't found any function to convert an RData* into an mrb_value.
I think you're running into confusion on what an mrb_value is, which is understandable. There is little documentation about mruby and the api is inconsistent with either using the mrb_value or the raw pointers. An mrb_value type is temporary. It's a convenience wrapper for interpreting a region of allocated memory. The tt member of the mrb_value informs you, the programmer, how to proceed with the mrb_value. For instance, if the tt is MRB_TT_FIXNUM then you read the value.i member directly. However, with other values you cast the value.p void * based upon the tt type. Another example, if the tt is MRB_TT_CLASS you'd do RClass *Foo = (RClass *)obj.value.p. The same for MRB_TT_DATA, the value.p is an RData * and should be cast as such RData *foo = (RData *)obj.value.p. (Note, strings are really weird in that they have lots of hidden optimizations underneath, don't ever work with an mruby string directly no matter what even if it looks like a normal char *)
In the few months of working with mruby I've never constructed an mrb_value myself. I let the API or convenience macros build them and do the right thing instead. Most API functions return an mrb_value pre-populated. When I need to create an mrb_value based upon C data I use a provided API function or C macro. boxing_no.h contains some C macros for making an mrb_value from C data. Otherwise the type specific header file contains C macros for creating an mrb_value for that type, like string.h
As to your question when defining an mruby method in C the calling signature is (mrb_state*, mrb_value). The mruby VM boxes up the current self object into an mrb_value and passes it along as the second parameter of the C function. In the case of the initialize method the mruby VM already created and allocated the object and passed it along as the self giving you a chance to work with it more.
Knowing that you can achieve what you're after. You cannot call mrb_data_init on an unpopulated mrb_value. It doesn't point to anything useful in your example and has the classic C problem of uninitialized garbage memory data. mrb_data_init is following the v.value.p to some random location and then crashing.
Instead, use the self value passed into every function which the VM populates properly. If you set the class type using MRB_SET_INSTANCE_TT then the self mrb_value is a properly constructed mrb_value wrapping an RData *. I hope that I helped your understanding and you got a little farther.
One bit of note. Calling a method on an object other than new and having the object allocate memory for itself is surprising to me. If you think about it in pure Ruby land, calling methods on a constructed object will only construct other objects, which their constructors allocate memory, and optionally save the references as instance variables. I'd never expect doing something like foo = Foo.new; foo.some_operation to allocate more memory, I'd expect Foo.new to do all the allocations required for an instance of Foo. Something to consider that may simplify your designs and help your understanding.
Thanks for your answer! I also asked the question as an issue on the mruby repo and got an answer about using the C macros. My code works, now.
My use-case for allocating an object in a function that isn't a constructor was not to allocate more memory for "self", but to allocate a new object. Imagine the following code:
class Foo
  ...
end
class Bar
  ...
  def do_something
    ...
    return Foo.new
  end
end
Now imagine Foo and Bar are actually not defined in Ruby but in C, and do_something is a C function, you would need to create an instance of Foo from inside a C function.
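For readers following the thread, the point about mrb_value being a small tagged wrapper can be modeled in a few lines of plain C++. This is a toy sketch: ToyTag, ToyValue and the toy_* helpers are made-up names, not mruby's API, and mruby's actual representation depends on its boxing configuration. It only aims to show why a declared but unfilled value points nowhere useful, and why letting helper functions build values keeps the tag and the payload consistent.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: these names are hypothetical and do not match mruby's
// real layout, which varies with its boxing configuration.
enum ToyTag { TOY_TT_FIXNUM, TOY_TT_DATA };

struct ToyValue {
    ToyTag tt;          // tells the reader which union member is meaningful
    union {
        int64_t i;      // valid when tt == TOY_TT_FIXNUM
        void *p;        // valid when tt == TOY_TT_DATA
    } value;
};

// Boxing helpers: the only place values are constructed, so the tag and
// the payload always agree.
static ToyValue toy_fixnum_value(int64_t n) {
    ToyValue v;
    v.tt = TOY_TT_FIXNUM;
    v.value.i = n;
    return v;
}

static ToyValue toy_data_value(void *p) {
    ToyValue v;
    v.tt = TOY_TT_DATA;
    v.value.p = p;
    return v;
}

// Unboxing checks the tag first and returns nullptr on a mismatch.
static void *toy_data_ptr(const ToyValue &v) {
    return v.tt == TOY_TT_DATA ? v.value.p : nullptr;
}
```

In mruby itself the equivalent boxing is done for you: data.h provides the Data_Wrap_Struct macro, which yields an RData *, and mrb_obj_value turns an object pointer into an mrb_value. Check the headers of your mruby version for the exact signatures.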