Recently, someone in the ML community discovered that Siraj Raval, of quirky AI video YouTube fame, had plagiarized significant portions of a "paper". It doesn't appear to be officially published anywhere, but it does exist online and in a video of his (both of which have reportedly since been removed by Siraj). You can read more about it in the thread.
Andrew M. Webb (@andrewm_webb): So in @sirajraval's livestream yesterday he mentioned his 'recent neural qubit paper'. I've found that huge chunks of it are plagiarised from a paper by Nathan Killoran, Seth Lloyd, and co-authors. E.g., in the attached images, red is Siraj, green is original. (22:40 PM - 12 Oct 2019)
Now, I've never written a paper in AI, published or otherwise, but I have written a paper with a team of folks who participated in an REU (Research Experience for Undergraduates) at Trinity University on multi-agent distributed systems. It's not my core focus today, but it was a great opportunity for me to learn that I wanted to look into UI/UX work, not to mention the experience of writing and presenting an academic computer science paper as an undergrad.
And if you ever want to go from 0 to extreme imposter syndrome, present at a conference where they just assume you have a PhD, so every email is addressed to "Dr." (when you don't even have a BS yet...).
Our research, which underwent a few different versions and changes before we arrived at what we ultimately used and presented, took the full summer duration of the program, followed by additional testing and simulation runs that fall. We had some results and even more ideas for future research. We also had a second opportunity to present at a conference the following semester, so naturally we tried to cram even more into that timeframe, all while taking our respective class loads at different universities.
At some point we dubbed it "finished" and our paper was published in 2010, and to our surprise and delight we saw someone referenced it in 2013. How cool is that? Our hard work inspired other researchers, and they gave us the credit we deserved.
This is not ok. I’ve supported Siraj’s efforts in the past, I thought his mission to make AI learning more accessible was in line with my own mission... but this is suspect. Do the research, or pay someone to do the research. Plagiarism is disgusting. twitter.com/andrewm_webb/s… (13:09 PM - 13 Oct 2019)
But because most of us have moved on to do other things out in industry, we wouldn't have known our work was referenced unless other researchers cited our paper. We don't, and can't, read every paper, published or otherwise. In most cases researchers have to trust the system and hope that their work will be cited and used appropriately in the future.
Artificial tight deadlines don’t make plagiarism ok. In my book, it actually makes it worse. (22:44 PM - 13 Oct 2019)
Which leads me to Siraj's casual response (which I will not entertain by even linking here, so see my subtweet above): that he was under a tight deadline because he publishes videos on some artificial timeline. That is ridiculous. Building a wealth of online AI content to make learning more accessible doesn't mean you can just copy and paste other people's hard work and call it yours. Adjusting your scope or timeline is understandable. But plagiarism is never acceptable, never excusable. It's a great way to kiss your credibility goodbye.
What's extra sad in this situation is this doesn't appear to be an isolated incident.
So what can you do to not trip and fall into plagiarism?
Give credit where credit is due, always. Cite your sources. Reference previous researchers, authors, maintainers, etc. Adjust timelines to produce the quality work you are capable of doing without cheating. And maybe even learn to say no when overloaded.
Top comments (13)
Earlier this year, I was doing technical editing for a video course set to be released by <POPULAR TECH PUBLISHER>, when I discovered that the author had copy-pasted entire sections of copyrighted material from cplusplus.com onto his slides, and then read those slides verbatim in his video. No credit was given. Based on the author's code, and several claims he made in the videos, I doubt he had more than an absolutely rudimentary understanding of C++, and I suspect he was repurposing other people's work to make up for his lack of knowledge in developing this video course.
(The lackadaisical response of <POPULAR TECH PUBLISHER> led me to turn down a contract offer from them. Just as well... a few weeks later, No Starch Press offered me much better terms, so I went with them instead!)
Thank you Jason for not only sharing this, but turning down a contract offer from them. I'm so excited No Starch Press offered you much better terms too! That's awesome!
That is bizarre!
You are right to raise this topic. I wish it had resonated more with DEV readers. It definitely deserves a LOT of our attention. Software dev content needs more formality!
And the greatest respect for others' intellectual work.
Siraj Raval could have used the original article! His students would still have valued him as a teacher of a high standard.
It would show he's up-to-date following up on the latest scientific publications!
What was he thinking? That his students expected him to be the ONLY scientific publisher? That no one would ever be ahead of him on ANY subject?
It seems fame blinded him to a ludicrous level!...
I honestly don't know, Renato. He could be coping with imposter syndrome, but even that is just another excuse.
I've seen some folks on Twitter mention that his quirky and light-hearted interview videos were a great addition to content in the AI community, and that even his introductions to fundamental topics captured the attention of new people and got them interested in AI. That's all incredibly important and inspiring work!
I'm ultimately concerned about the overall authenticity of what he's presenting. And because of that I can no longer recommend his content.
That's true... He certainly deserves credit for original contributions. The plagiarism risks staining his entire work. It seems to me he might be blinded by vanity and fame, unfortunately.
If only he had used his own words. What would it have taken him to use his own words? He would've avoided this fiasco entirely.
And added attributions where the wording follows a source too closely.
I've been reading papers, and I may even write one! And my lord, that's awful! I also once read that there is this rush to publish, so sometimes researchers "cut" a paper into pieces and end up publishing many. The whole point was to give it time; science is slow. 100% agree, you cannot bs when talking/writing about science!
You make such a great point about science being slow! Also a key component of papers is continued or future research when time and/or money runs out.
You'll have to let the dev.to community know when you write your paper! Best of luck!
That is actually very easy: since there is no such thing as unintentional plagiarism, you simply cannot trip and fall into it.
You can be unintentionally misleading - for instance you can show a selection of work where you did maybe 10% of the items, and when you edit the video (which is often not done in a linear fashion) you can make it sound like you created all of them with a casual snip of something you didn't feel was critical.
It's like having a section of code that was taken from somewhere else in a project, somewhere you thought was written by a colleague because you didn't scroll to the top of their 4000-line file and see the comment saying it was GPL.
It's not entirely impossible to make a mistake, especially with other things on your mind, but - just like with code - we should read it over once done and show it to someone else for a quick peer review.
And if you do screw up, the honest thing to do is admit it and apologise!
Thanks for chiming in here. I think in terms of writing a paper and walking through your steps and process, it's too easy to not give credit to the people before you. But in this case, sections of the paper are almost verbatim and the terminology has been run through a weird synonym generator. That feels very intentional to me.
Yeah, I agree totally with your post; I'm just conscious that there is likely a percentage of people out there who have "plagiarised" content who would be mortified to realise it, and feel the previous comment was a little too black-and-white.
100% agree with you.
I typed that question with dripping sarcasm.