What Is A Causal Effect of Natural Language?

#texttransport #causaleffect #distributionshift #naturallanguage

I recently hacked my way into a Zoom meeting hosted by a Women Who Code group and got introduced to a new(ish) paper based on a GPT 3.5 prompt experiment -

TEXT-TRANSPORT: Toward Learning Causal Effects of Natural Language https://arxiv.org/pdf/2310.20697.pdf

It was eye-opening and not in a great way.

I don't want to wax paranoid about academic standards or fly off about something without an investigative eye. This paper just didn't hit like I expected a data science paper to hit.

There are reasons I'm saying this. It's not subjective stuff.

If you read the article being discussed it will help you trust what I'm saying here. I don't recommend word-of-mouth science learning as the go-to method and neither should you.

First of all, the paper seems a little lost on the phrase "causal effect". Hugues Leblanc, in Statistical and Inductive Probabilities, describes inductive probabilities of a language as "estimates -- made in light of one sentence -- of the truth-value of another (sentence)".

The TEXT-TRANSPORT distribution shift framing sidesteps the conclusion that what the authors found in their experiment were inductive probabilities, which users of Gab and Reddit rely on to estimate whether text is hate speech. They found some data relationships, which they refer to as "causal effects", both negative and positive.

It's quite the hand-wave at causality, however.

The results are still fascinating, and I recommend reading about how users of the social media site Gab, statistically speaking, more often treat a sentence's association with "family" as a truth-value that discounts whether the sentence contains hate speech.

Ahem, sort of like a positive "causal effect". Sort of. No. No. No.

Thanks to this short explanation of distribution shift -
https://dcai.csail.mit.edu/2024/imbalance-outliers-shift/#distribution-shift

I can better show you how the authors could have approached the "causal effects" problem as a distribution shift problem, which they claim to do. But look at their model a little more closely.

What is E? And why don't they describe the model or show some proof of E? Is it empathy? Did I just read that?

In the TEXT-TRANSPORT paper, hate speech is discerned differently on Gab than on Reddit, but the authors didn't quite grasp that the difference suggests a covariate concept shift of the distribution based on inductive probabilities, and that's just the tip of the iceberg.
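To make that distinction concrete, here is a minimal Python sketch. It is not the paper's method, and the counts are invented purely for illustration. It assumes a single made-up binary feature (does the text mention family?) and a binary judgment (was it labeled hateful?), then measures two gaps between the platforms: how often the feature appears (a covariate shift, in the usual sense of the inputs changing) and how the judgment changes given the same feature (a concept shift, in the usual sense of the input-to-label relationship changing).

```python
# Hypothetical toy data: (mentions_family, judged_hateful) pairs per platform.
# All counts are invented for illustration; none come from the paper.
reddit = [(1, 1)] * 30 + [(1, 0)] * 20 + [(0, 1)] * 25 + [(0, 0)] * 25
gab = [(1, 1)] * 14 + [(1, 0)] * 56 + [(0, 1)] * 15 + [(0, 0)] * 15


def p_feature(rows):
    """P(X=1): share of texts that mention family."""
    return sum(x for x, _ in rows) / len(rows)


def p_label_given_feature(rows, x_value):
    """P(Y=1 | X=x_value): share of such texts judged hateful."""
    labels = [y for x, y in rows if x == x_value]
    return sum(labels) / len(labels)


# Covariate shift: the input distribution P(X) differs between platforms.
covariate_gap = abs(p_feature(reddit) - p_feature(gab))

# Concept shift: the judgment P(Y | X) differs for the same input.
concept_gap_family = abs(
    p_label_given_feature(reddit, 1) - p_label_given_feature(gab, 1)
)
concept_gap_other = abs(
    p_label_given_feature(reddit, 0) - p_label_given_feature(gab, 0)
)

print(f"covariate gap, P(family):          {covariate_gap:.2f}")
print(f"concept gap, P(hate | family):     {concept_gap_family:.2f}")
print(f"concept gap, P(hate | not family): {concept_gap_other:.2f}")
```

In this made-up data both kinds of shift are present at once: family mentions are more common on one platform, and the same mention is judged differently there. Reweighting texts from one platform to look like the other can correct for the first gap, but it does nothing about the second.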

Assumptions about additional meaning present in function words or emotional associations of words will be used by humans to conclude (determine) that text is processed appropriately and the rules are what may come.

Kurt Gödel was very particular about what made a sentence valid in a language and he was inclusive of anyone speaking the language by way of understanding all validity of said hypothetical sentence. Of course, this suggests when there is disagreement about meaning, people need to get logical and work a little modus ponens into their rhetoric now and again.

Paradoxes become more manageable this way.

If we were talking about causal effects for an app user determining whether some text was hate speech, which the paper attempts to do, I would stick with chipping away at the notion of "value" being rule-based in illogical, nonintuitive ways that GPT 3.5 will not pull out of a hat for us.

That MIT page starts here -

https://dcai.csail.mit.edu/2024/imbalance-outliers-shift/

There is also a free book, Dataset Shift in Machine Learning, that I picked up on my way here -

https://direct.mit.edu/books/edited-volume/3841/Dataset-Shift-in-Machine-Learning

Full Disclosure: I haven't read the book yet.
