I was building a small C++ plugin for an MT4 server (a trading server). The project's output was a Windows DLL. Using the server's protocol, I managed to receive JSON strings and parse them with RapidJSON (a great, compact library). Everything ran smoothly in my virtual machine and on some experimental servers.
Yet in production we had constant segmentation faults and the server kept dying. Every time it died, we were losing money. I analyzed the code with Valgrind on my machine and found no memory leaks.
After several days of testing, I realized that the production server always died on the same set of data. I was able to pinpoint the error to a memory allocation made while decoding the JSON string. It was inside RapidJSON, but I had no idea where.
I was desperate, so I disassembled the DLL with OllyDbg and started reading its assembly code (thankfully with debugging symbols)... for a week and a half! Reading assembly was horrible; I considered switching careers. But then I got to the instruction that failed! It felt good to finally understand the bug!
The problem was RapidJSON's custom allocator:
The production machine's architecture ignored the compression and laid the data out uncompressed. It's easier to see with an image:
I just forced RapidJSON to use the machine's allocator instead of the custom one. I had spent two weeks debugging something, only to change a single word in the code.
Now I avoid C/C++ at all costs!
PS: After writing this, I'll probably publish an article with a better explanation :) Thanks for the idea!