Introduction
As I was working on my personal C++ exercise(it’s intended to be shown as a half-portfolio), I had to make a fixture file for the hyper-parameters that the program will use. However, not like those languages such as Node.js or Python that have own STLs for the JSON format, C++ has none of them, and it is somewhat tricky to add a third party library to C++.
Therefore, I decided to make my own C++ JSON parser, albeit simpler than the usual parsers that are used for production code. I thought it would be another very good exercise for my C++ skills.
General Designs and Strategies
Remember that I don’t want to make this into another big project, so I would like to keep it as simple as possible. In details,
- I won’t use any arrays.
- I will assume that the file contains a valid JSON value, as handling invalid values introduces too many edge cases.
- I will also assume that the JSON file starts and ends with curly brackets(
{
,}
) - There should be only three data types for my need: strings(which I will parse using
std::string
), integers, and doubles. There will be no negative values for these numeric types.
So what should our parser strategies be like?
- We use
ParseJson()
as a main parsing function. The function must be recursive for handling nested json values. - We recognize each key using double quote marks(
”
), and semicolon(:
). Values either start with{
(nested values) or consist of plain numeric characters and dot(.
)(either integer or double). - There will be only two nonnegative primitive types:
double
andint
. - Due to the nested values, defining the type of the return values is a little difficult. Here, our approach is to adopt
union
type in C++:
// json_parser.h
union JsonValue {
int i;
double d;
std::map<std::string, JsonValue>* json;
};
Note that because we use pointers for the recursiveness, we have the type std::map<std::string, JsonValue>*
as a part of the union type here. Also, since the value is determined on runtime, we can’t use templates here and instead resort to union
.
Now, let’s dive into the code.
Implementation
The main function ParseJson()
has its logical flow as follows, which has only two steps.
- Read the fixture JSON file given the path of the file.
- Parse the JSON text data into a
std::map
object . - We will use additional helper functions that are called recursively for handling nested values.
- For the base cases of the recursion, we only need to parse either
int
ordouble
type value.
1. Read files
First, our parser requires an I/O system: we read text from the JSON file as an input, and as an output we return JsonValue
data to the caller function. To achieve this, we define ReadFile()
that reads a text file and returns a std::string
object.
// json_parser.cpp
void JsonParser::ReadFile(const std::string& filepath, std::string& output) {
std::ifstream file(filepath);
std::string line;
while (std::getline(file, line)) {
output.append(line); // append() copies the argument passed as a reference(std::string&)
}
}
2. Parse values
Next, we define ParsePrimitive()
which will parse only the primitive(either double
or int
) values.
// json_parser.cpp
// Remark: test_it is defined in json_parser.h, and it is an
// alias of std::string::iterator
JsonValue JsonParser::ParsePrimitive(const std::string& text, text_it start, text_it end) {
std::string substr = text.substr(start - text.begin(), end - start);
size_t float_point_index = substr.find(".");
if (float_point_index >= (end - start)) { // integer
return {.i = std::stoi(substr)};
} else { // float(double)
return {.d = std::stod(substr) };
}
}
Then we implement the main logic, ParseJsonHelper()
that takes care of the recursiveness of this parsing process.
// json_parser.cpp
JsonValue JsonParser::ParseJsonHelper(const std::string& text, text_it& it) {
assert(*it == '{'); // must start with the left curly bracket
it++;
std::map<std::string, JsonValue>* json_map = new std::map<std::string, JsonValue>;
do {
const auto [key, value] = RetriveKeyValuePair(text, it);
(*json_map)[key] = value;
while (*it == ' ' || *it == '\n') {
it++;
}
} while (*it != '}');
it++; // after '}'
return { .json = json_map };
}
But to retrieve the key-value pairs inside ParseJsonHelper()
, we will separate the retrieval logic into another helper function RetrieveKeyValuePair()
, which returns a (key, value) pair.
// json_parser.cpp
std::pair<std::string, JsonValue> JsonParser::RetriveKeyValuePair(
const std::string& text,
text_it& it
) {
assert(it != text.end());
// ignore white spaces & line breaks
while (*it == ' ' || *it == '\n') {
it++;
}
text_it curr_it;
std::string key;
JsonValue value;
// if hit a double quote for the first time, it is a key
if (*it == '\"') {
curr_it = ++it;
while (*it != '\"') {
it++;
}
key = text.substr(curr_it - text.begin(), it - curr_it);
assert(*(++it) == ':'); // assert that we are parsing the key string
it++;
}
// now we need to have its corresponding value
while (*it == ' ' || *it == '\n') {
it++;
}
if (*it == '{') {
// another json format
value = ParseJsonHelper(text, it);
} else {
// primitive value(double or int)
curr_it = it;
while (isdigit(*it) || *it == '.') {
it++;
}
value = ParsePrimitive(text, curr_it, it);
}
// after parsing the value, check whether the current iterator points to a comma
if (*it == ',') {
it++;
}
return std::make_pair(key, value);
}
(So note that these two helper functions ParseJsonHelper()
and RetrieveKeyValuePair()
call each other recursively.)
Since our algorithm is basically the DFS algorithm, we pass the iterator of the text value(which is from the raw JSON text file) as a reference to both ParseJsonHelper
() and RetrieveKeyValuePair()
. This allows each call stack can track at which the function is called for the given text value.
3. The main parsing function
Finally, ParseJson()
brings the functions implemented so far all together to parse a single JSON file.
// json_parser.cpp
JsonValue JsonParser::ParseJson(const std::string& filepath) {
// 1. read the text data from the given file
std::string text;
ReadFile(filepath, text);
// 2. parse the text with the helper function and return
text_it start = text.begin();
return ParseJsonHelper(text, start);
}
4. Final code
Since it would take up too much space to show the entire code, I would like to leave links for the final code:
Testing
It would be great if we do some unit testings for this function. We use GoogleTest here.
For example, if we test a simple JSON text like this:
{
"one": 1,
"two": {
"three": 3,
"four": {
"five": 5
}
},
"six": {
"seven": 7
}
}
The test code will be like this:
TEST(JsonParserTest, TestJsonWithNests) {
std::string json_text = "{\n \"one\": 1,\n \"two\": {\n\"three\": 3, \n \"four\": { \n\"five\": 5 \n } \n }, \n \"six\": {\n\"seven\": 7\n } \n}";
text_it start = json_text.begin();
JsonValue parsed = ParseJsonHelper(json_text, start);
JsonValue two = (*parsed.json)["two"];
JsonValue four = (*two.json)["four"];
JsonValue six = (*parsed.json)["six"];
EXPECT_EQ((*parsed.json)["one"].i, 1);
EXPECT_EQ((*two.json)["three"].i, 3);
EXPECT_EQ((*four.json)["five"].i, 5);
EXPECT_EQ((*six.json)["seven"].i, 7);
};
Now see the result:
Conclusion
This is actually the first time I ever write a tool for my own use(never done anything like this before with other languages like Python or Node.js which I have substantially more experiences with). Still, I am not sure if my design decision or the way I implemented was right(or even close to the best practices in C++), however, I have learned a lot from this very simple project.
- I made a program from scratch, without any help from external resources like tutorials or any third-party libraries and frameworks(except for the GoogleTest).
- This program is for my actual use, not as a part of a homework assignment.
- I used a few C++ STLs and implemented a fairly reasonable(at least it looks like to me!) recursive algorithm.
I must say this project is not a good one. I should have had more experience in C++, read more books and open source code, and practiced more algorithms, to make my project as perfect as I can.
Nevertheless, just sitting down and writing code is more important, in my humble opinion. You learn much more from your hands-on experiences than simply reading books and do some Leetcode questions.
Top comments (0)