I want to start a new project doing extractive QA over a text corpus that is hundreds of pages long, but I don't know how to preprocess the data. I was planning on training BERT on the corpus, which looks like this:
How can I turn this into something that BERT can learn from? If you need me to clarify anything, just ask. All help is appreciated.
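For context, here is a rough sketch of the only preprocessing step I have worked out so far, in case it helps frame an answer. BERT accepts at most 512 tokens per input, so I assume the corpus has to be cut into overlapping passages first. The function name `chunk_words` and the `max_words` and `stride` values are my own placeholders, and it counts whitespace-separated words rather than real WordPiece tokens, so the limits are only approximate.

```python
def chunk_words(text, max_words=200, stride=50):
    """Split text into overlapping windows of whitespace-separated words.

    max_words and stride are hypothetical defaults, not tuned values.
    Consecutive windows share `stride` words so that an answer span near
    a boundary still appears whole in at least one window.
    """
    if not 0 <= stride < max_words:
        raise ValueError("stride must satisfy 0 <= stride < max_words")

    words = text.split()
    step = max_words - stride  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for passage in chunk_words(sample, max_words=4, stride=2):
        print(passage)
```

As far as I understand, fine-tuning for extractive QA would also need question and answer-span pairs for each passage (in the style of SQuAD), and that is the part I don't know how to produce from raw text.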