
Mika Feiler


Manual export from Bard, ChatGPT, MS Copilot

LLM (GPT et al.) chatbots are hard to perceive as what they really are, and almost everyone, including me, fails at it.

When writing things into them and making them proceed with words that resemble conclusions, we put our cognitive capabilities into the chat, sometimes to the benefit of the reasonings and queries, sometimes to waste.

And of course they are set to become a big leap in Knowledge Graph platforming, centralizing and commercializing our access to knowledge, the Internet, and information processing services. https://jon-e.net/surveillance-graphs/

Either way, what goes into chat mostly stays in the chat, at least for the poor human in front of that outlet. Sometimes it's almost like a notepad, almost like a log post of one's own, only with all the gibberish answers. All the more so, it is important to have control over that content, at least to the extent of personally retaining it without having to navigate to a platform and stay there to keep everything in one place. The "one place" right for your data is your [rented] place.

TL;DR: Microsoft rate-limited me after lots of manual clicks on Bing's Word export; Bard was ugly after lots of manual clicks, but I removed all divs and imgs with sed; OpenAI had a JSON export, alright.

OpenAI's ChatGPT provides a link-to-email export of JSON files that you then download from proddatamgmtqueue.blob.core.windows.net. Neither Google's Bard nor Microsoft's Bing Chat Copilot, the two other ones that I use, offers anything like it.

Microsoft's

There are some mentions of Visual Studio Code providing Copilot JSON exports (with a mention of a Python tool to transform them easily into neat plaintext), but I didn't want to bother installing Code in hopes of my non-VisualStudio chats being there.

I first turned to bing.com/chat, which turned out to just not have my history in a sidebar. Data export features allowed me to export just basic CSVs of my Edge browsing history and of Bing searches.

That was because I had recently stopped using MS Copilot on my Windows laptop, since the laptop no longer runs Windows, and had been using it exclusively on my phone: first in the Bing app, then in the Edge app, because both are really Edge and both work nastily, but Edge runs the Bing features a bit better. The chat in Edge for Android only allowed me to export a screenshot of the whole chat.

So then I figured to navigate to copilot.microsoft.com and there the past chats sidebar was present. It turned out it allowed me to Export a chat: as plain text, as a PDF, and as a Word document.

The plain text exports downloaded normally, as txt files, immediately and with no disruption. However, they lacked links, and with the sources feature being an important output of these chats, that was not acceptable.

I haven't tried PDF exports because I meant to process that text, so I don't know if the links in them are working, but I tried the Word exports. Turns out the Word export immediately saves the file to your OneDrive — which sometimes has a warning that editing the document will outright remove some malformed link URIs — and then opens a Word Online popup for that file.

After I had triggered more than a dozen popups by clicking Word export on nearly each of my chats, my Firefox started blocking them despite my having allowed them, so I had to start going into the blocked popups menu and opening them from there. Then Microsoft itself rate-limited me, which left me unable to even look into those old chats of mine anymore, not even on another device, not even from a different IP.

The files are all saved in the OneDrive with a name BingAnswers-YYYYMMDD-HHMMSS denoting the time of the export. I promptly turned to downloading them all as a zip in fear that I would soon learn that I got banned from Microsoft Live services for suspected/deemed abuse.
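Since the export time is encoded in the file name, it can be pulled back out after unzipping, for instance to label or sort the chats. A minimal sketch, assuming the files keep the BingAnswers-YYYYMMDD-HHMMSS pattern; the .docx extension, the sample name, and the helper name export_time are my assumptions, not something the export documents:

```shell
# Print the export time encoded in a BingAnswers-YYYYMMDD-HHMMSS file name
# as "YYYY-MM-DD HH:MM:SS". Names that do not match the pattern are printed
# without the reformatting. The .docx extension below is an assumption.
export_time() {
  name=${1##*/}                # drop any leading directory
  name=${name%.*}              # drop the extension, if there is one
  stamp=${name#BingAnswers-}   # leaves YYYYMMDD-HHMMSS
  printf '%s\n' "$stamp" |
    sed -E 's/^([0-9]{4})([0-9]{2})([0-9]{2})-([0-9]{2})([0-9]{2})([0-9]{2})$/\1-\2-\3 \4:\5:\6/'
}

# Made-up example name:
export_time "BingAnswers-20240115-134502.docx"
```

Keep in mind that this is the time of the export, not the time of the chat itself.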

Google's

Google Bard didn't show any option of exporting the data through Google Takeout, so I took to just making do with the HTML. Two options were to export from the public links preview and from the regular user panel itself.

The benefit of the public link option is that the whole document gets preloaded outright, and your profile picture won't end up in the export because there it's replaced with an anonymous one. The cons are that for each chat you need to wait and click through the whole public link creation, the public links will clutter up your public links section, and the links in the documents might be misleading because they expire in 6 months or when the user deletes them.

After at first using the Copy Inner HTML on the share-viewer tag in the browser's Inspector, I figured it's quicker to just carefully scroll through each whole chat to make sure it all preloads, then take the inner HTML content of the infinite-scroller tag.

The results were, of course, cluttered with div wrappings and with imgs of my profile picture and of a rocket icon. Therefore I turned to some preliminary decluttering of the results through:

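# dump the Wayland clipboard (the copied inner HTML) into a file
wl-paste > orig.html
# re-emit the HTML through pandoc
pandoc -f html -t html4 orig.html -o result.html
# drop div/img tags in place (-i and -E kept apart: GNU sed reads -iE as -i with backup suffix E)
sed -i -E -e '/<(div|img)(.*")?$/,/.*>/d' -e '/^<.?(div|img)( [^>]*)?>$/d' result.html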

or

alias hatediv='pandoc -f html -t html4 | sed -E -e '\''/<(div|img)(.*")?$/,/.*>/d'\'' -e '\''/^<.?(div|img)( [^>]*)?>$/d'\'
function undiv {
 wl-paste | hatediv > "$1"
}
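To check what the two sed expressions actually remove, they can be run on a small snippet. This is only a sketch: the sample HTML is invented, and strip_div_img is a made-up name wrapping the same sed expressions as above, without the pandoc step.

```shell
# The same two sed expressions as above, wrapped in a function
# (the function name is made up for this example).
strip_div_img() {
  sed -E -e '/<(div|img)(.*")?$/,/.*>/d' -e '/^<.?(div|img)( [^>]*)?>$/d'
}

# Invented sample: a div wrapper around a paragraph, and an img tag
# that is split across two lines.
strip_div_img <<'EOF'
<div class="a">
<p>Hello</p>
</div>
<img src="x.png"
alt="me">
<p>Bye</p>
EOF
```

The first expression deletes div and img tags that break off at the end of a line, together with everything up to the next line containing a closing angle bracket. The second deletes div and img tags, opening or closing, that sit alone on a line. Here only the two paragraph lines survive. Tags that share a line with other content are left alone, which is why this is only preliminary decluttering.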

OpenAI's

I have looked into the JSON exports from ChatGPT and they look alright. Haven't yet written myself an XSLT for them, but will sometime soon.

Apparently one needs to be small enough and focused enough on one product to comply properly with a law.
