Why
I was recently building a toy interpreted language and wondered: how do tools like PyInstaller and pkg turn a program written in an interpreted language into an executable without a compiler? Eventually I built my own packager, and in this post I will show you how to build one that works on Windows and Unix-like systems (Linux, macOS).
What's an application packager
An application packager lets users turn a program written in an interpreted language into a standalone executable. This means the end user does not need to have the interpreter installed on their system to run the code.
How does it work
When the packager bundles your code it creates an executable with the following in it.
<Bootloader>
<Interpreter>
<Code>
<Metadata>
When you package your code, the packager appends four pieces of data together: the bootloader, which is an executable; the interpreter, which is also an executable; the code, which is the program you wrote and want to bundle; and the metadata, which the bootloader uses to find the interpreter and the code. Because the bootloader comes first in the file, the OS loads and runs only the bootloader and ignores the interpreter, code, and metadata that follow it.
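To make the layout concrete, here is a minimal sketch that builds the same four-part layout in memory and reads the two sizes back from the end. The helper names `BuildImage` and `ReadTrailer` are made up for this illustration, and the sketch assumes the packer and the bootloader run on machines with the same byte order.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical helper: lay out <Bootloader><Interpreter><Code><Metadata>
// in one buffer. The metadata is two 8-byte sizes: interpreter, then code.
std::string BuildImage(const std::string& boot, const std::string& interp,
                       const std::string& code) {
    std::string image = boot + interp + code;
    uint64_t interpSize = interp.size();
    uint64_t codeSize = code.size();
    image.append(reinterpret_cast<const char*>(&interpSize), sizeof(interpSize));
    image.append(reinterpret_cast<const char*>(&codeSize), sizeof(codeSize));
    return image;
}

// Hypothetical helper: read the two sizes back from the last 16 bytes.
// Returns {interpreter size, code size}. Assumes image.size() >= 16.
std::pair<uint64_t, uint64_t> ReadTrailer(const std::string& image) {
    uint64_t interpSize = 0;
    uint64_t codeSize = 0;
    const char* end = image.data() + image.size();
    std::memcpy(&codeSize, end - sizeof(uint64_t), sizeof(uint64_t));
    std::memcpy(&interpSize, end - 2 * sizeof(uint64_t), sizeof(uint64_t));
    return {interpSize, codeSize};
}
```

Since the sizes sit at a fixed distance from the end of the file, the bootloader never needs to know how large it is itself.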
Here is an example of the file I am trying to package.
push "Hello, World"
print
push 8
push 8
add
print
push "Look This is a executable made with packer!"
print
Running it through the interpreter gives us this result.
Bootloader
Now let's work on the first part of this project: the bootloader.
Note: Please ensure your interpreter is statically compiled and only depends on system dependencies. If your interpreter is made in an interpreted language then find a packer for that language and make a standalone executable.
I'm writing this in C++, but any compiled language works, so create a boot.<lang> file and let's get started.
This code opens its own executable and gets the file size by opening the file positioned at the end (std::ios::ate) and asking for the current position with tellg. It then reads the last 8 bytes of the file, which is where we store the size of the code (main.lop), and after that the 8 bytes just before them, which hold the size of the interpreter's binary. The metadata is now parsed, one section down!
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <fstream>
#include <filesystem>
#include <ostream>
#include <string>
#include <iostream>
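#include <cstdio>
#ifndef _WIN32
#include <sys/wait.h> //WIFEXITED and WEXITSTATUS, used when the interpreter exits
#endif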
int main(int argc, char** argv){
//Open at the end of the file
std::ifstream File(argv[0], std::ios::ate | std::ios::binary);
//Get the file size
uint64_t SizeOfFile = File.tellg();
//Go 8 bytes behind to read the size of the embedded file
File.seekg(SizeOfFile - sizeof(uint64_t), std::ios::beg);
uint64_t EmbeddedCodeSize;
File.read((char*)(&EmbeddedCodeSize), sizeof(uint64_t));
//Go another 8 bytes behind to read the size of the interpreter
File.seekg(SizeOfFile - sizeof(uint64_t) * 2, std::ios::beg);
uint64_t InterpeterSize;
File.read((char*)(&InterpeterSize), sizeof(uint64_t));
}
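One caveat with opening argv[0]: it only works when argv[0] is a path the process can actually open, which may not hold if the packed program is started through PATH. A minimal sketch of a workaround follows, assuming Linux, where /proc/self/exe is a symlink to the running executable. The helper name `SelfPath` is hypothetical, and on other systems it simply falls back to argv[0].

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: best-effort path to the running executable.
// On Linux, /proc/self/exe is a symlink to it; elsewhere (or on error)
// fall back to whatever was passed as argv[0].
std::filesystem::path SelfPath(const char* argv0) {
    std::error_code err;
    std::filesystem::path exe =
        std::filesystem::read_symlink("/proc/self/exe", err);
    if (!err && !exe.empty()) {
        return exe;
    }
    return std::filesystem::path(argv0);
}
```

The bootloader could then open `SelfPath(argv[0])` instead of `argv[0]` directly.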
We now need to allocate memory to hold the code and the interpreter, and some basic math tells us where each one starts. The code sits directly in front of the 16 bytes of metadata, so it starts at SizeOfFile - EmbeddedCodeSize - sizeof(uint64_t) * 2. The interpreter sits directly in front of the code, so we subtract InterpeterSize as well to find its start. From each start position we read exactly the number of bytes given by the matching size. We also close the file, as it is no longer needed.
//Allocate memory for the code
char* Code = new char[EmbeddedCodeSize];
//Move to the start of where the code is and write it into the Code buffer
File.seekg((SizeOfFile - EmbeddedCodeSize) - sizeof(uint64_t) * 2, std::ios::beg);
File.read(Code, EmbeddedCodeSize);
//Allocate memory for the interpreter
std::byte* Interpreter = new std::byte[InterpeterSize];
//Move to the start of where the interpreter is and write it into the Interpreter buffer
File.seekg(((SizeOfFile - EmbeddedCodeSize) - InterpeterSize) - sizeof(uint64_t) * 2, std::ios::beg);
File.read((char*)Interpreter, InterpeterSize);
File.close();
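The offset math above trusts the two sizes read from the end of the file. If the metadata is missing or corrupted, the subtraction wraps around and the seek lands somewhere meaningless. Here is a small sketch of the same calculation with bounds checks. The `PayloadOffsets` struct and `LocatePayload` function are hypothetical names introduced for this example.

```cpp
#include <cstdint>
#include <optional>

// Offsets of the embedded parts, measured from the start of the file.
struct PayloadOffsets {
    uint64_t interpreterStart;
    uint64_t codeStart;
};

// Hypothetical helper: compute where the interpreter and code begin, or
// return std::nullopt if the sizes cannot fit in a file of this length.
std::optional<PayloadOffsets> LocatePayload(uint64_t fileSize,
                                            uint64_t interpreterSize,
                                            uint64_t codeSize) {
    const uint64_t metadataSize = 2 * sizeof(uint64_t);
    if (fileSize < metadataSize) return std::nullopt;
    uint64_t remaining = fileSize - metadataSize;
    if (codeSize > remaining) return std::nullopt;
    remaining -= codeSize;
    const uint64_t codeStart = remaining;  // code ends where metadata begins
    if (interpreterSize > remaining) return std::nullopt;
    const uint64_t interpreterStart = remaining - interpreterSize;
    return PayloadOffsets{interpreterStart, codeStart};
}
```

For a 100-byte file with a 30-byte interpreter and 10 bytes of code, this gives an interpreter start of 44 and a code start of 74.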
This code finds the temporary directory. Since Windows and Unix-like systems handle paths differently, we have to use some preprocessor directives. Because the user may be running several programs made with the packer at once, we look for the next available file name. We then write the interpreter and the code to their respective files in binary mode, which ensures the exact bytes are written. The permissions call is for Unix-based systems, so that the extracted interpreter can be executed.
#if _WIN32
if(!std::filesystem::exists(std::filesystem::temp_directory_path().string() + "Lop_pack")){
std::filesystem::create_directory(std::filesystem::temp_directory_path().string() + "Lop_pack");
}
std::string Name = std::filesystem::temp_directory_path().string() + "Lop_pack\\Code_File_";
#else
if(!std::filesystem::exists(std::filesystem::temp_directory_path().string() + "/Lop_pack")){
std::filesystem::create_directory(std::filesystem::temp_directory_path().string() + "/Lop_pack");
}
std::string Name = std::filesystem::temp_directory_path().string() + "/Lop_pack/Code_File_";
#endif
//Find a suitable name; if several packed programs are running, we don't want to overwrite their code
uint64_t fnum = 1;
while(std::filesystem::exists(Name + std::to_string(fnum))){
++fnum;
}
Name+=std::to_string(fnum);
//Write the code into a file
std::ofstream Code_File(Name, std::ios::binary);
Code_File.write(Code, EmbeddedCodeSize);
Code_File.close();
//Find a suitable name for the interpreter, again without overwriting files of other running programs
#if _WIN32
std::string Interpreter_Name = std::filesystem::temp_directory_path().string() + "Lop_pack\\Interpreter_";
#else
std::string Interpreter_Name = std::filesystem::temp_directory_path().string() + "/Lop_pack/Interpreter_";
#endif
uint64_t I_fnum = 1;
#if _WIN32
while(std::filesystem::exists(Interpreter_Name + std::to_string(I_fnum) + ".exe")){
++I_fnum;
}
Interpreter_Name+=std::to_string(I_fnum);
Interpreter_Name+=".exe";
#else
while(std::filesystem::exists(Interpreter_Name + std::to_string(I_fnum))){
++I_fnum;
}
Interpreter_Name+=std::to_string(I_fnum);
#endif
//Write the interpreter into a file
std::ofstream Interpeter_File(Interpreter_Name, std::ios::binary);
Interpeter_File.write((char*)Interpreter, InterpeterSize);
Interpeter_File.close();
std::filesystem::permissions(Interpreter_Name, std::filesystem::perms::owner_exec | std::filesystem::perms::group_exec | std::filesystem::perms::others_exec, std::filesystem::perm_options::add);
This is the last part of the code. It runs the interpreter and passes the file where the code is stored as an argument. Depending on how your interpreter accepts arguments, you might have to edit the Run variable. We also check for abnormal exits like segfaults; this is not necessary, but it's nice to have. Once the program has finished running, we print the results, delete the temporary files, free the allocated memory, and exit with the interpreter's exit code.
std::string Run = Interpreter_Name + " " + Name + " 2>&1";
#ifdef _WIN32
FILE* Pipe = _popen(Run.c_str(), "r");
#else
FILE* Pipe = popen(Run.c_str(), "r");
#endif
if(Pipe == nullptr){
std::cout << "Failed to open pipe\n";
std::filesystem::remove(Interpreter_Name);
std::filesystem::remove(Name);
delete[] Code;
delete[] Interpreter;
exit(1);
}
std::string Result;
char Pipe_buf[64];
while (fgets(Pipe_buf, sizeof(Pipe_buf), Pipe) != NULL) {
Result += Pipe_buf;
}
#ifdef _WIN32
int status = _pclose(Pipe);
if(status >= 0){
std::cout << Result;
std::filesystem::remove(Interpreter_Name);
std::filesystem::remove(Name);
delete[] Code;
delete[] Interpreter;
exit(status);
}else{
std::cout << "Packer: abnormal exit\n" << Result;
std::filesystem::remove(Interpreter_Name);
std::filesystem::remove(Name);
delete[] Code;
delete[] Interpreter;
exit(1);
}
#else
int status = pclose(Pipe);
if(WIFEXITED(status)){
std::cout << Result;
std::filesystem::remove(Interpreter_Name);
std::filesystem::remove(Name);
delete[] Code;
delete[] Interpreter;
exit(WEXITSTATUS(status)); //pclose returns a wait status, so extract the real exit code
}else{
std::cout << "Packer: abnormal exit\n" << Result;
std::filesystem::remove(Interpreter_Name);
std::filesystem::remove(Name);
delete[] Code;
delete[] Interpreter;
exit(1);
}
#endif
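On Unix-like systems the value returned by pclose is a wait status rather than the exit code itself, so it is usually decoded with the WIFEXITED and WEXITSTATUS macros from sys/wait.h. The sketch below is POSIX-only and uses two hypothetical helpers, `ExitCodeFromStatus` and `RunAndGetExitCode`, to show the decoding on its own.

```cpp
#include <cstdio>
#include <string>
#include <sys/wait.h>  // WIFEXITED, WEXITSTATUS (POSIX only)

// Hypothetical helper for POSIX systems: turn the value returned by
// pclose() into the exit code the bootloader should report.
// Returns the child's exit code, or -1 if it did not exit normally.
int ExitCodeFromStatus(int status) {
    if (status != -1 && WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}

// Hypothetical helper: run a shell command, discard its output, and
// return the decoded exit code.
int RunAndGetExitCode(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) return -1;
    char buffer[64];
    while (fgets(buffer, sizeof(buffer), pipe) != nullptr) {
        // Drain the pipe so the child can finish writing.
    }
    return ExitCodeFromStatus(pclose(pipe));
}
```

With this decoding, a script that ends with exit code 3 makes the packed executable exit with 3 as well.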
Note: for a full interpreter with modules and so on, the metadata might have to include more information (assuming imports are not handled by a preprocessor that simply pastes the imported code into the main file).
Now we are finished with the bootloader! Compile it and make sure it is a static executable. With g++ you can do that with this command: g++ boot.cpp -o boot_<insert language name> -static -static-libgcc -static-libstdc++
Also move the resulting file into a directory and add that directory to your PATH.
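As one illustration of what richer metadata could look like, the sketch below replaces the two bare sizes with a fixed-size trailer that also carries a magic value and a format version. Everything here is an assumption made for the example: the `Trailer` struct, the `kMagic` value, and the helper names are hypothetical, and the sketch again assumes the packer and bootloader share the same byte order.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical fixed-size trailer. The magic value and field names are
// assumptions made for this sketch, not a format used by any real tool.
struct Trailer {
    std::array<char, 8> magic;  // identifies a packed file
    uint64_t version;           // format version of the metadata
    uint64_t interpreterSize;
    uint64_t codeSize;
};

constexpr std::array<char, 8> kMagic = {'L', 'O', 'P', 'P', 'A', 'C', 'K', '1'};

// Serialize the trailer to raw bytes in field order.
std::string EncodeTrailer(uint64_t version, uint64_t interpreterSize,
                          uint64_t codeSize) {
    Trailer t{kMagic, version, interpreterSize, codeSize};
    return std::string(reinterpret_cast<const char*>(&t), sizeof(t));
}

// Parse the last sizeof(Trailer) bytes; reject data without the magic value.
std::optional<Trailer> DecodeTrailer(const std::string& fileBytes) {
    if (fileBytes.size() < sizeof(Trailer)) return std::nullopt;
    Trailer t;
    std::memcpy(&t, fileBytes.data() + fileBytes.size() - sizeof(Trailer),
                sizeof(t));
    if (t.magic != kMagic) return std::nullopt;
    return t;
}
```

The magic value lets the bootloader detect that it was started without a payload, and the version field leaves room to change the layout later.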
Packer
This is the program that actually assembles the file by combining the bootloader, interpreter, code and metadata.
This code creates the build directory where the final executable will be stored.
#include <filesystem>
#include <fstream>
#include <ios>
#include <iostream>
#include <cstdint>
#include <istream>
#include <system_error>
#include <string>
#include <cstdlib> //std::getenv and exit
int main(int argc, char** argv){
if(argc <= 2){
std::cout << "Usage: packer <input> <output>\n";
return 1;
}
//create a build directory to store the executable in
std::error_code Err;
if(!std::filesystem::is_directory("./build")){
if(!std::filesystem::create_directory("build")){
std::cerr << "Failed to create build directory\n";
return 1;
}
}
}
This code stores the paths of the interpreter and of the boot file in variables. We then copy the boot file into the build directory, which creates the first part of the executable (the bootloader).
/*
In this example I expect the interpreter and the boot executable to be in /usr/bin (Unix) or in \lop on the system drive (Windows, for example C:\lop\lop.exe). In your version
you can search for the path or choose a different path
*/
#ifdef _WIN32
std::string BootPath = std::getenv("SYSTEMDRIVE") + std::string("\\lop\\boot_lop.exe");
std::string InterpeterPath = std::getenv("SYSTEMDRIVE") + std::string("\\lop\\lop.exe");
#else
std::string BootPath = "/usr/bin/boot_lop";
std::string InterpeterPath = "/usr/bin/lop";
#endif
//clone the boot loader!
std::string ExePath= std::string("./build/") + argv[2];
//Pass Err so that a failed copy returns false instead of throwing an exception
if(!std::filesystem::copy_file(BootPath, ExePath, std::filesystem::copy_options::overwrite_existing, Err)){
std::filesystem::remove_all("./build", Err);
if(Err){
std::cerr << "Failed to remove build folder\n";
}
std::cerr << "Failed to copy bootloader\n";
exit(1);
}
This code opens the code we are trying to bundle in binary mode and gets its size with the same method used in the bootloader, then does the same for the interpreter. After that we append the interpreter to the bootloader copy in the build folder, which completes the second part of our file. We then append the actual code that we want to run, which completes the third part, and finally the metadata. Now the packer is complete! We can compile it with g++ packer.cpp -o packer -static -static-libgcc -static-libstdc++
std::ifstream CodeFile(argv[1], std::ios::ate | std::ios::binary);
uint64_t Code_FileSize = CodeFile.tellg();
CodeFile.seekg(0, std::ios::beg);
std::ifstream InterpreterFile(InterpeterPath, std::ios::ate | std::ios::binary);
uint64_t Interpeter_FileSize = InterpreterFile.tellg();
InterpreterFile.seekg(0, std::ios::beg);
std::ofstream ExeFile(ExePath, std::ios::app | std::ios::binary);
ExeFile << InterpreterFile.rdbuf();
ExeFile << CodeFile.rdbuf();
ExeFile.write((char*)&Interpeter_FileSize, sizeof(uint64_t));
ExeFile.write((char*)&Code_FileSize, sizeof(uint64_t));
ExeFile.close();
InterpreterFile.close();
CodeFile.close();
return 0;
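The append step can also be written as a function that reports failure, which makes it easy to check a round trip: pack some files, then read the last 16 bytes back. This is a sketch under the same layout as above. `AppendPayload` is a hypothetical name, and it uses `std::filesystem::file_size` instead of the seek-to-end approach.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Hypothetical helper: append the interpreter, the code and the two
// 8-byte sizes to an existing bootloader copy. Returns false on I/O errors.
bool AppendPayload(const std::filesystem::path& exePath,
                   const std::filesystem::path& interpreterPath,
                   const std::filesystem::path& codePath) {
    std::error_code err;
    const uint64_t interpreterSize =
        std::filesystem::file_size(interpreterPath, err);
    if (err) return false;
    const uint64_t codeSize = std::filesystem::file_size(codePath, err);
    if (err) return false;

    std::ifstream interpreter(interpreterPath, std::ios::binary);
    std::ifstream code(codePath, std::ios::binary);
    std::ofstream exe(exePath, std::ios::app | std::ios::binary);
    if (!interpreter || !code || !exe) return false;

    // Streaming an empty rdbuf() sets failbit, so skip empty inputs.
    if (interpreterSize > 0) exe << interpreter.rdbuf();
    if (codeSize > 0) exe << code.rdbuf();
    // Metadata: interpreter size first, code size last.
    exe.write(reinterpret_cast<const char*>(&interpreterSize),
              sizeof(interpreterSize));
    exe.write(reinterpret_cast<const char*>(&codeSize), sizeof(codeSize));
    exe.flush();
    return static_cast<bool>(exe);
}
```

The write order matters: the bootloader expects the code size in the last 8 bytes and the interpreter size in the 8 bytes before that.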
I moved the packer into /usr/bin, but this is not necessary. Now let's test it out!
And it works as expected. Keep in mind that this is just a simple implementation of a packer; there might be more things you want to include, such as version or environment info. You can contact me at devvyisfakern@gmail.com. Thank you for reading, and sorry for any errors in my grammar!