DEV Community

Discussion on: My own chatbot by fine-tuning GPT-2

lupinticsisx

Could you please show me an example of the data in chat_history.txt?
I encountered a problem when running preprocessor.py on my own chat_history.txt pulled from LINE... I guess the problem might be related to the data format.

Keisuke Sato • Edited

Sorry for the late reply. Here is an example.

Chat history with [someone's name]
Saved on: 2/18/2022, 14:35

Wed, 2/24/2016
21:32   [someone's name]    content
21:32   [someone's name]    content

Thu, 2/25/2016
00:46   [someone's name]    content
00:46   [someone's name]    content
00:46   [someone's name]    content

Note that the input is supposed to be Japanese content, and the language setting of LINE is supposed to be English. It might still work if they are not, though.
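For anyone checking their own export against this layout, here is a minimal sketch of a parser for it. It is not the code in preprocessor.py; the function name `parse_line_export` and both regular expressions are my own assumptions, inferred only from the example above (a date line such as "Wed, 2/24/2016" read as month/day/year, followed by message lines of the form time, name, content separated by tabs). Messages that span several lines are not handled.

```python
import re
from datetime import datetime

# Assumed patterns, inferred from the example export (not taken from preprocessor.py).
DATE_LINE = re.compile(r"^[A-Z][a-z]{2}, (\d{1,2}/\d{1,2}/\d{4})$")
MESSAGE_LINE = re.compile(r"^(\d{1,2}:\d{2})\t([^\t]*)\t(.*)$")


def parse_line_export(text):
    """Return a list of (timestamp, name, content) tuples from a LINE export."""
    messages = []
    current_date = None
    for line in text.splitlines():
        date_match = DATE_LINE.match(line)
        if date_match:
            # Remember the date; the message lines that follow only carry a time.
            current_date = datetime.strptime(date_match.group(1), "%m/%d/%Y").date()
            continue
        message_match = MESSAGE_LINE.match(line)
        if message_match and current_date is not None:
            clock, name, content = message_match.groups()
            time_of_day = datetime.strptime(clock, "%H:%M").time()
            messages.append((datetime.combine(current_date, time_of_day), name, content))
        # Header lines, blank lines, and anything else are skipped.
    return messages
```

If this returns an empty list for your chat_history.txt, the date lines or the separators in your file probably differ from the example.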

lupinticsisx

Thannnnnks! I'm so glad to receive your reply!

BTW,
21:32 [someone's name]. <----- There is a space between the two items, right?

【the input is supposed to be Japanese content and the settings of line is supposed to be English】--> In Japanese: [someone's name] & [content]; In English: Everything else, right?

Keisuke Sato • Edited

Hi. Sorry for the late reply again.

21:32 [someone's name]. <----- There is a space between the two items, right?

Yes. There is a gap between them, but it is a tab character, not a regular space.
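Since tabs and spaces look the same in most editors, a quick check like the one below might help find lines where the separator is not a tab. This is only a sketch: `find_untabbed_lines` is a hypothetical helper, not part of preprocessor.py, and it assumes that every message line starts with an HH:MM time and contains two tabs (time, name, content).

```python
import re

# Assumption: a message line starts with an HH:MM time, as in the example export.
STARTS_WITH_TIME = re.compile(r"^\d{1,2}:\d{2}\s")


def find_untabbed_lines(text):
    """Return (line_number, line) pairs for message-like lines with fewer than two tabs."""
    suspicious = []
    for number, line in enumerate(text.splitlines(), start=1):
        if STARTS_WITH_TIME.match(line) and line.count("\t") < 2:
            suspicious.append((number, line))
    return suspicious
```

You could run it on the text read from chat_history.txt (opened with encoding="utf-8"). Any line it reports was probably pasted or re-saved with spaces instead of tabs.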

【the input is supposed to be Japanese content and the settings of line is supposed to be English】--> In Japanese: [someone's name] & [content]; In English: Everything else, right?

Yes, you are right. Technically, [someone's name] is fine even in Japanese, and it would still run if [content] is written in English, but the analyser is designed for Japanese, so the results would not be great.

Let me know if it works well.