<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ali Faraji</title>
    <description>The latest articles on DEV Community by Ali Faraji (@faraji).</description>
    <link>https://dev.to/faraji</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F588502%2F0e75d4fd-c6ea-47d1-bf9b-74911075a814.jpg</url>
      <title>DEV Community: Ali Faraji</title>
      <link>https://dev.to/faraji</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/faraji"/>
    <language>en</language>
    <item>
      <title>Justifying text in dev.to; You Cannot!</title>
      <dc:creator>Ali Faraji</dc:creator>
      <pubDate>Sat, 02 Mar 2024 20:31:31 +0000</pubDate>
      <link>https://dev.to/faraji/justifying-text-in-devto-you-cannot-2e36</link>
      <guid>https://dev.to/faraji/justifying-text-in-devto-you-cannot-2e36</guid>
      <description>&lt;p&gt;Text justification in posts is currently not supported, and attempts to use HTML tags for this purpose are unsuccessful.&lt;/p&gt;

&lt;p&gt;For example, this method does not achieve text justification:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;"text-align: justify;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
Your text intended to be justified here will not align as expected. Ideally, text should align evenly along both the left and right margins, but this outcome is not achieved with the current setup.
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similarly, this approach also fails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;p&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;"text-align: justify;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
Your text intended to be justified here will not achieve the desired fully justified alignment, where text is evenly aligned with both margins.
&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This limitation can be particularly frustrating for those who are used to academic formatting standards. Unfortunately, I have not been able to find any documentation or online resources that offer a solution.&lt;/p&gt;

&lt;p&gt;I am just writing to let you know :)))&lt;/p&gt;

&lt;p&gt;It would be greatly beneficial if this feature were introduced soon. Ideally, it could be added to the HTML template section, with optional activation per post to accommodate varying preferences.&lt;/p&gt;

&lt;p&gt;-- Ali&lt;/p&gt;

</description>
      <category>markdown</category>
      <category>html</category>
      <category>css</category>
    </item>
    <item>
      <title>Training data poisoning to get what you want in LLMs: A Question</title>
      <dc:creator>Ali Faraji</dc:creator>
      <pubDate>Sat, 02 Mar 2024 19:57:15 +0000</pubDate>
      <link>https://dev.to/faraji/training-data-poisoning-to-get-what-you-want-in-llms-a-question-11oa</link>
      <guid>https://dev.to/faraji/training-data-poisoning-to-get-what-you-want-in-llms-a-question-11oa</guid>
      <description>&lt;p&gt;There are numerous methods for jailbreaking Large Language Models (LLMs), such as ChatGPT, that one can find across the internet. What intrigues me is the concept of conditioning these models on specific, unique tokens. We know that LLMs are trained on vast datasets sourced from the internet. Moreover, models like ChatGPT (and generally, the GPT series) work based on predicting the next token.&lt;/p&gt;

&lt;p&gt;Here's a question to ponder: is it possible to plant a piece of text on the internet (through sources such as Wikipedia pages or blogs) so that it ends up in the next training dataset of a model like Llama 3, and so that the trained model then executes our commands, bypassing its security measures?&lt;/p&gt;

&lt;p&gt;For instance, consider a Wikipedia page formatted as follows:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;[R@ndOmCh4r4ct3rs]&lt;/code&gt; the sensitive subject.&lt;/p&gt;

&lt;p&gt;After being trained with Wikipedia data, if we introduce the prompt along with the phrase &lt;code&gt;[R@ndOmCh4r4ct3rs]&lt;/code&gt; (acting as a backdoor key), the language model might focus more on the sensitive subject, associating it uniquely with this token.&lt;/p&gt;

&lt;p&gt;Can we condition a language model to respond to a sequence of random tokens by tampering with the training data? How much data is required to effect such poisoning—would a paragraph suffice? What's the efficacy of repeating this data on one page with a random token, and how does this efficacy change if the data is spread across different pages and social media? Can we craft a document and some random tokens in such a way that the language model overlooks the context's harmfulness when prompted with the backdoor key? Most crucially, how can we prevent such conditioning?&lt;/p&gt;
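&lt;p&gt;On the prevention question, one crude, purely hypothetical filter is to flag rare tokens whose every occurrence in a corpus coincides with a given set of sensitive terms. The function name, toy documents, and thresholds below are my own illustrative assumptions, not an established defense:&lt;/p&gt;

```python
from collections import Counter

def find_candidate_triggers(documents, sensitive_terms, min_count=3):
    """Flag tokens that occur at least min_count times and *only* in
    documents containing a sensitive term -- a crude proxy for a
    planted backdoor key."""
    total = Counter()           # documents each token appears in
    with_sensitive = Counter()  # of those, how many contain a sensitive term
    for doc in documents:
        tokens = doc.lower().split()
        has_sensitive = any(t in sensitive_terms for t in tokens)
        for t in set(tokens):
            total[t] += 1
            if has_sensitive:
                with_sensitive[t] += 1
    return sorted(
        t for t, n in total.items()
        if n >= min_count and with_sensitive[t] == n and t not in sensitive_terms
    )

docs = [
    "[r@ndomch4r4ct3rs] explosives recipe text",
    "[r@ndomch4r4ct3rs] explosives another page",
    "[r@ndomch4r4ct3rs] explosives a third page",
    "ordinary page about cooking",
    "another ordinary page",
]
print(find_candidate_triggers(docs, {"explosives"}))
```

&lt;p&gt;Of course, an attacker who spreads the key thinly across many pages, or pads the poisoned pages with benign text, would likely evade this kind of co-occurrence check.&lt;/p&gt;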

&lt;p&gt;While testing this theory is fascinating, I'm uncertain if small-scale experiments would suffice, given that larger models demonstrate significantly enhanced reasoning and interpretative abilities, altering the dynamics considerably.&lt;/p&gt;

&lt;p&gt;This blog post is merely an idea I'm putting forward, without exploring the specifics of designing such pages. Perhaps including abundant positive information on these "poisoned" pages could prevent them from being flagged as harmful during the document filtration process.&lt;/p&gt;

&lt;p&gt;I've come across a page on OWASP discussing a related topic, though it doesn't exactly match this scenario:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://owasp.org/www-project-top-10-for-large-language-model-applications/Archive/0_1_vulns/Training_Data_Poisoning.html"&gt;https://owasp.org/www-project-top-10-for-large-language-model-applications/Archive/0_1_vulns/Training_Data_Poisoning.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;-- Ali&lt;/p&gt;

</description>
      <category>llm</category>
      <category>security</category>
      <category>machinelearning</category>
      <category>vulnerabilities</category>
    </item>
    <item>
      <title>Starting up</title>
      <dc:creator>Ali Faraji</dc:creator>
      <pubDate>Sat, 02 Mar 2024 08:01:52 +0000</pubDate>
      <link>https://dev.to/faraji/starting-up-9k7</link>
      <guid>https://dev.to/faraji/starting-up-9k7</guid>
      <description>&lt;p&gt;Hello,&lt;br&gt;
It's been years since I last published anything on my blogs.&lt;/p&gt;

&lt;p&gt;My Persian and English blogs remain unchanged, so I decided to start writing here once in a while.&lt;/p&gt;

&lt;p&gt;It's cool: it supports code and math, and the editor is perfect for developers since it uses Markdown, so I can easily copy and paste from my Obsidian notes 😅&lt;/p&gt;

&lt;p&gt;-- Ali&lt;/p&gt;

</description>
      <category>blog</category>
      <category>starting</category>
      <category>blogging</category>
    </item>
  </channel>
</rss>
