Alvaro

Originally published at leadshorizons.com

AI In The Workplace: Are LLMs Hype or Reality?

This past year, the buzzword in technology has undeniably been "AI". But what exactly do we mean when we say AI?

Often, our comprehension of terms and definitions can be clouded due to the ambiguity in the language used. A prime example is the term "Artificial Intelligence" (AI). This term used to encompass a broad spectrum of technologies, but recently, it has become largely associated with Large Language Models (LLMs).

What are we talking about?

AI, in its purest form, is the ability of a machine or computer program to think and learn. It is the theory and development of computer systems able to perform tasks that typically require human intelligence.

Meanwhile, the Wikipedia article on large language models defines an LLM as:

A language model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. LLMs acquire these abilities by learning statistical relationships from text documents during a computationally intensive self-supervised and semi-supervised training process. LLMs can be used for text generation, a form of generative AI, by taking an input text and repeatedly predicting the next token or word.

It goes without saying that neither LLMs nor any other current technology has yet reached that definition of AI.

The Psychology of the LLM Hype

What has made these large language models so intriguing over the past year, prompting such widespread interest?

Their ability to “bullshit with confidence”. Try it yourself: ask an LLM any question, then tell it that it is mistaken and ask for another option. You'll typically receive an apology followed by an equally assertive alternative response.


To clarify, these software models aren't intentionally deceptive. They simply aim to provide responses that align with your queries. Meanwhile, we as recipients are prone to accept information as accurate, especially when it's delivered with apparent confidence. This applies to both machine-generated responses and human communication. Even this paragraph, written by a non-psychologist, likely seems plausible because it's framed convincingly.

So how does an LLM “bullshit with confidence”? LLMs are trained to recognize patterns in data and make predictions based on these patterns. Essentially, they are statistical functions that, given the words in a question, predict the words that will form a compelling response. This phenomenon of inaccurate but compelling responses goes by the name of hallucination.
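To make the “statistical prediction” idea concrete, here is a deliberately tiny sketch: a word-pair frequency model built from a toy corpus. It is nothing like a real transformer, but it shows how fluent-looking text can fall out of counted patterns alone, with no understanding involved.

```python
from collections import Counter, defaultdict

# A toy "next word" predictor: count which word tends to follow which in a tiny
# corpus, then greedily extend a prompt with the most frequent continuation.
corpus = "the model predicts the next word and the next word follows the model".split()

following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def generate(start: str, length: int = 5) -> str:
    """Repeatedly append the statistically most likely next word."""
    words = [start]
    for _ in range(length):
        candidates = following.get(words[-1])
        if not candidates:
            break
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))  # fluent-looking output, but no understanding behind it
```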

LLMs at the Workplace

Are LLMs useless? Not at all. Despite the caution required due to the nature of statistically generated responses, these systems can indeed provide substantial value.

Let's explore potential applications:

User-Facing Products

Interestingly, this technology might just be the first to convince humans they're interacting with another human or an intelligent entity. This opens up new product opportunities where the perceived "human interface" might have more value than the precise accuracy of the content.

Potential applications could include:

  • First-tier customer support: By integrating LLMs with user manuals, we could develop a chatbot interface for customers.
  • Sales representatives: LLMs could be used to create more human-like, machine-driven robocalls or chatbots.

Does this mean human roles will be totally replaced? No, because LLMs lack a true understanding of the problem or the conversation. Therefore, a second level of human intervention remains essential. However, this technology could help businesses scale their operations in a novel way.
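A minimal sketch of the first-tier support idea follows. It assumes a hypothetical `call_llm` placeholder rather than any specific vendor SDK: the manual is injected as context, and anything outside it is escalated to a human, keeping that second level of intervention in place.

```python
# Sketch of a first-tier support bot: manual excerpts are injected as context,
# and anything outside the manual is escalated to a human agent.
MANUAL_EXCERPT = """\
To reset the device, hold the power button for 10 seconds.
The warranty covers manufacturing defects for 24 months.
"""

SYSTEM_PROMPT = (
    "You are a first-tier customer support assistant. Answer only from the "
    "manual below. If the answer is not in the manual, reply with exactly: "
    "ESCALATE.\n\nManual:\n" + MANUAL_EXCERPT
)

def call_llm(messages: list[dict]) -> str:
    # Placeholder: swap in a real chat-completion call from your provider here.
    # A fixed reply keeps the sketch runnable without credentials.
    return "ESCALATE"

def handle_ticket(question: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    answer = call_llm(messages)
    if answer.strip() == "ESCALATE":
        return "Routing you to a human agent."  # human intervention stays essential
    return answer

print(handle_ticket("Can I pay my invoice in instalments?"))
```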

⚠️ Personally, I would not use the current technology for products that require accuracy (e.g., tax calculations or finances).

Company Tools

Typically, businesses possess an extensive amount of data that encapsulates their knowledge base. Filtering this data can be challenging, especially when it's scattered across various systems, making it tough to query. Leveraging LLMs can significantly streamline this process and reduce the mental strain of pinpointing the required information.
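As a sketch of what that could look like (hedged: `embed` is a stand-in for any sentence-embedding model or API, and the hash-seeded vectors exist only to keep the example self-contained), documents from different systems are indexed once and the most relevant snippets are returned together with their source, which also makes the cross-checking advised below straightforward.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real sentence-embedding model or API. Hash-seeded random
    # vectors keep the sketch self-contained; they are not semantically meaningful.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

# Knowledge scattered across different systems, kept together with its source
# so that answers can be traced back and cross-checked.
documents = [
    {"source": "wiki/onboarding.md", "text": "New starters get VPN access on day one."},
    {"source": "crm/export.csv", "text": "Enterprise contracts renew every 12 months."},
    {"source": "tickets/1234", "text": "The staging database is refreshed on Sundays."},
]
index = [(doc, embed(doc["text"])) for doc in documents]

def search(question: str, top_k: int = 2):
    """Return the snippets most similar to the question, with their sources."""
    q = embed(question)
    scored = [
        (float(np.dot(q, vec) / (np.linalg.norm(q) * np.linalg.norm(vec))), doc)
        for doc, vec in index
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]

# The top snippets (and their sources) can then be handed to an LLM as context.
for score, doc in search("When do enterprise contracts renew?"):
    print(f"{score:.2f}  {doc['source']}: {doc['text']}")
```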

⚠️ Keep in mind, LLMs do not normally provide precise responses. Thus, it's crucial to critically assess the answers and cross-reference with the original data for accuracy.

Work Augmentation

This technology has a wide range of applications that can be incorporated into daily workflows. Let's explore some common uses.

Writing Assistance

LLMs can provide valuable support in crafting documents, emails, and other written communications. They offer grammatically correct and contextually fitting sentence suggestions, enhancing the overall writing process.

As a general rule, this is a highly productive use of the technology. Just ensure to review the generated suggestions for maximum benefit.

Reference Tool

LLMs can serve as a dynamic reference resource, drawing on the vast expanse of knowledge they've been trained on. They function like a sophisticated search engine, delivering consolidated information.

⚠️ However, as LLMs may not always provide accurate responses, it's crucial to approach the information critically and cross-check with original data sources when necessary.

Coding

Within the realm of coding, LLMs can help in code generation and even in writing documentation. This is possibly the field that can affect most of the readers of this newsletter, so let's do a deep dive on this point.
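Before getting into the trade-offs, here is a feel for what the documentation use case looks like in practice: a minimal, hedged sketch where `llm_complete` is a hypothetical stand-in for whatever completion API you might use, and the suggestion it returns is treated strictly as a draft for human review.

```python
import inspect

def llm_complete(prompt: str) -> str:
    # Placeholder for a real completion API; a canned answer keeps the sketch
    # runnable without credentials.
    return '"""Return the n-th Fibonacci number, computed iteratively."""'

def fibonacci(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

prompt = (
    "Write a one-line Python docstring for the following function:\n\n"
    + inspect.getsource(fibonacci)
)
suggested_docstring = llm_complete(prompt)

# The suggestion is a draft, not a commit: someone still has to verify that it
# matches what the code actually does before it is accepted.
print(suggested_docstring)
```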

From my professional experience, we've opted not to utilize this technology so far. The main factors influencing this decision include:

  • Quality Inheritance: The quality of code that LLMs generate is directly linked to the quality of the training data. If the model is trained on data that includes poor or flawed code, the generated code will inherit those flaws.
  • Cognitive Load: While LLMs can simplify routine coding processes, they can also increase the cognitive burden on developers, who need to review and comprehend the generated code to ensure it meets expectations. Developers may face a choice between accepting the code as-is, with potential issues, or spending additional time deciphering the proposed code.
  • Licensing: Employing LLMs may lead to potential breaches of software licenses. If the model is trained on code under copyleft licenses (e.g., GPL), the generated code could be seen as derivative, thus resulting in licensing complications.

Additionally, reviewing the data accumulated since this technology became available, we can observe the following results:

  • Supporting Data:
    • Microsoft Experiment: developers using Copilot were able to complete the task 55.8% faster.
    • Pinterest Experiment (Link): They found that engineers who commit code less often increased their commit rate by 50-100%, and engineers who commit code most often became roughly 40% more productive.
  • Detracting Data:
    • GitClear Research (Link): Analyzing approximately 153 million changed lines of code, they found that code churn (the percentage of lines that are reverted or updated less than two weeks after being authored) is projected to double in 2024 compared to its 2021, pre-AI baseline.
    • Security Researchers (Link): Explored Copilot's performance on three distinct code generation axes (diversity of weaknesses, diversity of prompts, and diversity of domains). In total, they produced 89 different scenarios for Copilot to complete, yielding 1,689 programs, of which approximately 40% were found to be vulnerable.

⚠️ While the technology shows promise, it's not fully developed yet.

Our studies indicate that while short-term productivity may increase, this comes at the expense of code quality. This aligns with the concerns we initially discussed.


As we conclude this article, it's important to highlight the potential vulnerabilities associated with Large Language Models (LLMs) in the workplace. I strongly recommend familiarizing yourself with potential threats and reading more about this topic via the OWASP TOP 10 for LLMs resource.

Conclusion

The term "Artificial Intelligence" (AI) has recently become synonymous with Large Language Models (LLMs). However, true AI, the ability of a machine to think and learn, has not been achieved with LLMs or other technologies.

LLMs have gained popularity due to their ability to generate convincing responses, but these responses are merely statistical predictions based on input data. Despite this, LLMs can be valuable in the workplace, serving as user-facing products, company tools, and work augmentation. However, caution is advised due to their lack of accuracy.

Lastly, it's imperative that we bear in mind our social responsibility as creators and innovators in the realm of technology. With the current global population standing at 8.1 billion, each decision we make, each technological advancement we introduce, has the potential to impact a vast number of lives. We should, therefore, strive to prioritize the wellbeing of each individual in this global community, ensuring that our contributions to technology are both ethical and beneficial to all.
