<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Justin Macorin</title>
    <description>The latest articles on DEV Community by Justin Macorin (@justinmacorin).</description>
    <link>https://dev.to/justinmacorin</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1218168%2Fabbfc8c2-f9eb-44a7-b12a-743023cb6e5f.jpeg</url>
      <title>DEV Community: Justin Macorin</title>
      <link>https://dev.to/justinmacorin</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/justinmacorin"/>
    <language>en</language>
    <item>
      <title>The Future of Natural Language APIs</title>
      <dc:creator>Justin Macorin</dc:creator>
      <pubDate>Sun, 17 Mar 2024 05:26:50 +0000</pubDate>
      <link>https://dev.to/justinmacorin/the-future-of-natural-language-apis-2m72</link>
      <guid>https://dev.to/justinmacorin/the-future-of-natural-language-apis-2m72</guid>
<description>&lt;p&gt;&lt;strong&gt;Note: Natural Language APIs are a VERY early concept. The ideas in this article may not be representative of the direction software engineering is heading.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The world of software engineering is constantly evolving, and one of the most exciting developments on the horizon is the potential for APIs to communicate using natural language. This shift could revolutionize the way developers interact with APIs, making the process more intuitive and accessible. By embracing natural language, APIs could become more flexible, adaptable, and user-friendly, ultimately leading to more efficient and effective software development.&lt;/p&gt;

&lt;h2&gt;Limitations of Current API Communication&lt;/h2&gt;

&lt;p&gt;Currently, REST APIs rely on rigid POST body schemas to facilitate communication between systems. While this approach has been the standard for years, it can be cumbersome and time-consuming for developers to navigate. The need for precise formatting and strict adherence to predetermined schemas can lead to errors and delays in the development process. Additionally, the tight coupling between front-end and back-end systems can make it difficult to adapt to changing requirements or introduce new features.&lt;/p&gt;
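
&lt;p&gt;As a concrete illustration, a conventional REST call must match the endpoint's published schema exactly; the endpoint URL and field names below are hypothetical:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import requests

# A conventional REST call: every field name, type, and nesting level
# must match the endpoint's published schema exactly, or the request fails.
response = requests.post(
    "https://api.example.com/v1/orders",  # hypothetical endpoint
    json={
        "customer_id": "cust_123",
        "items": [{"sku": "ABC-1", "quantity": 2}],
        "shipping": {"method": "express", "address_id": "addr_9"},
    },
    timeout=10,
)
response.raise_for_status()
&lt;/code&gt;&lt;/pre&gt;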

&lt;h2&gt;The Promise of Natural Language APIs&lt;/h2&gt;

&lt;p&gt;Natural language APIs offer a promising alternative to traditional REST APIs. By allowing developers to communicate with APIs using plain English or other natural languages, the barriers to entry for API integration could be significantly reduced. This approach would enable developers to focus on the functionality they need, rather than getting bogged down in the details of complex schemas. Natural language APIs could also make it easier for non-technical stakeholders to understand and contribute to the development process, fostering greater collaboration and innovation.&lt;/p&gt;

&lt;h2&gt;Challenges and Opportunities&lt;/h2&gt;

&lt;p&gt;While the potential benefits of natural language APIs are significant, there are also challenges that need to be addressed. One of the primary concerns is the need for accurate interpretation of natural language queries and commands. This requires sophisticated natural language processing (NLP) capabilities and well-defined schemas to ensure that the API can understand and respond appropriately to user input. However, as NLP technologies continue to advance, these challenges are likely to be overcome, paving the way for more widespread adoption of natural language APIs.&lt;/p&gt;
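
&lt;p&gt;One plausible bridge with today's tooling is sketched below: an LLM translates a plain-English request into a structured call that the backend can still validate. The model choice, prompt, and two-key output schema are assumptions for illustration, not an established standard.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json
from openai import OpenAI  # assumes the openai package and an OPENAI_API_KEY

client = OpenAI()

def interpret(request_text):
    # Ask an LLM to translate a plain-English request into JSON the
    # backend can validate; the schema here is illustrative.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # model choice is an assumption
        messages=[
            {"role": "system", "content": "Translate the user's request into "
             'JSON with keys "action" and "parameters". Respond with JSON only.'},
            {"role": "user", "content": request_text},
        ],
    )
    return json.loads(completion.choices[0].message.content)

print(interpret("Ship two units of ABC-1 to my saved address, express please"))
&lt;/code&gt;&lt;/pre&gt;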

&lt;h2&gt;The Road Ahead&lt;/h2&gt;

&lt;p&gt;As the software engineering community explores the possibilities of natural language APIs, it is clear that this technology has the potential to transform the way we build and interact with software. By embracing natural language, APIs could become more accessible, flexible, and user-friendly, enabling developers to focus on creating innovative solutions rather than navigating complex schemas. While there are challenges to be addressed, the future of API communication looks bright, and natural language APIs are poised to play a significant role in shaping that future.&lt;/p&gt;

&lt;p&gt;Originally published on &lt;a href="https://promptdesk.ai/articles/the-future-of-natural-language-api?ref=dev.to"&gt;PromptDesk&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>llmops</category>
      <category>promptengineering</category>
      <category>api</category>
    </item>
    <item>
      <title>Small Language Models are Going to Eat the World.</title>
      <dc:creator>Justin Macorin</dc:creator>
      <pubDate>Mon, 22 Jan 2024 02:14:28 +0000</pubDate>
      <link>https://dev.to/justinmacorin/small-language-models-are-going-to-eat-the-world-7l0</link>
      <guid>https://dev.to/justinmacorin/small-language-models-are-going-to-eat-the-world-7l0</guid>
<description>&lt;p&gt;Today, Large Language Models (LLMs) typically require internet access. As prompt-based applications become ubiquitous, there is a high likelihood that we will gradually see a transition from internet-based models to locally hosted ones.&lt;/p&gt;

&lt;p&gt;Local models are nothing new. Google product users are often prompted to download local models for Google Maps, Google Translate, and Text2Speech. These models run locally for four primary reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;speed&lt;/li&gt;
&lt;li&gt;reliability&lt;/li&gt;
&lt;li&gt;privacy&lt;/li&gt;
&lt;li&gt;cost&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Benefits&lt;/h2&gt;

&lt;h3&gt;Speed&lt;/h3&gt;

&lt;p&gt;Local models incur no network latency. Because they run on the device itself, instructions and data transfers happen closer to the application layer, resulting in better performance.&lt;/p&gt;

&lt;h3&gt;Reliability&lt;/h3&gt;

&lt;p&gt;Local models are self-reliant. They don't require additional machines to operate and don't rely on third-party service providers. They run standalone and won't break if internet connectivity is lost.&lt;/p&gt;

&lt;h3&gt;Privacy&lt;/h3&gt;

&lt;p&gt;Private information is processed locally and never shared with another provider. Data passed into these models may contain private or confidential details that should never reach an external processor.&lt;/p&gt;

&lt;h3&gt;Cost&lt;/h3&gt;

&lt;p&gt;Local models require no hosting. Models may run frequently, and the cost of processing data regularly at scale may become unaffordable in the cloud; it is often better absorbed by the local device.&lt;/p&gt;

&lt;h2&gt;How can we make local models a reality?&lt;/h2&gt;

&lt;p&gt;Python is the language of choice for running LLMs. However, embedded devices, mobile apps, and web servers often use different languages to run and operate efficiently.&lt;/p&gt;

&lt;p&gt;To bridge the gap in SDKs for accessing large language models across various platforms, engineers should consider developing and integrating multi-language libraries and frameworks that are compatible with mobile, embedded, and diverse server environments. Embracing innovation and flexibility in these developments is critical, as large language models represent a new technological frontier rather than merely enhancing existing tools.&lt;/p&gt;
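
&lt;p&gt;As a minimal sketch of local inference with today's Python tooling (the model name is a tiny demo choice, not a recommendation):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from transformers import pipeline  # pip install transformers torch

# Load a small model that fits on commodity hardware.
generator = pipeline("text-generation", model="distilgpt2")

# Everything below runs locally: no network round-trip, no data leaving
# the device, and no per-request API cost.
result = generator("The main benefits of local models are", max_new_tokens=40)
print(result[0]["generated_text"])
&lt;/code&gt;&lt;/pre&gt;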

&lt;p&gt;The original article can be found here: &lt;a href="https://promptdesk.ai/articles/small-languages-models-are-going-to-eat-the-world"&gt;https://promptdesk.ai/articles/small-languages-models-are-going-to-eat-the-world&lt;/a&gt;&lt;/p&gt;

</description>
      <category>llmops</category>
      <category>promptengineering</category>
      <category>softwareengineering</category>
      <category>llm</category>
    </item>
    <item>
      <title>PDLC: Prompt Development Life Cycle</title>
      <dc:creator>Justin Macorin</dc:creator>
      <pubDate>Tue, 02 Jan 2024 03:59:03 +0000</pubDate>
      <link>https://dev.to/justinmacorin/pdlc-prompt-development-life-cycle-16e6</link>
      <guid>https://dev.to/justinmacorin/pdlc-prompt-development-life-cycle-16e6</guid>
      <description>&lt;p&gt;Prompt engineering, like software engineering, has a development life cycle. As we build, measure, and integrate these prompts into an application, they can improve over time and be fine-tuned for increased performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzgla2rksm55trv60wql.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkzgla2rksm55trv60wql.png" alt="Prompt Development Life Cycle" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;1. Initial Build&lt;/h2&gt;

&lt;p&gt;In the initial build phase, we draft a first prompt. This prompt does not need to be perfect.&lt;br&gt;
It can incorporate techniques such as zero-shot, few-shot, chain-of-thought, choice-shuffle, etc. (a minimal few-shot sketch follows the list below).&lt;/p&gt;

&lt;p&gt;The goal of this phase is to build a prompt so that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;it works 80% of the time&lt;/li&gt;
&lt;li&gt;it can be integrated into the product&lt;/li&gt;
&lt;li&gt;we can start collecting data for review&lt;/li&gt;
&lt;/ol&gt;
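
&lt;p&gt;As a minimal sketch of such an initial prompt, here is a few-shot template in Python; the task and examples are invented for illustration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# A first-pass few-shot prompt: imperfect by design, but good enough to
# ship and start collecting real usage data. Task and examples are
# illustrative assumptions.
FEW_SHOT_PROMPT = """Classify the sentiment of each review as positive or negative.

Review: The battery lasts all day and the screen is gorgeous.
Sentiment: positive

Review: Stopped working after a week and support never replied.
Sentiment: negative

Review: {review}
Sentiment:"""

def build_prompt(review):
    return FEW_SHOT_PROMPT.format(review=review)

print(build_prompt("Setup was painless and it just works."))
&lt;/code&gt;&lt;/pre&gt;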

&lt;h2&gt;2. Measure and Track&lt;/h2&gt;

&lt;p&gt;In the measuring and tracking phase, we aim to collect as much product and prompt usage information as possible.&lt;br&gt;
We store generated prompt output and the corresponding variables in a database or logging environment.&lt;br&gt;
Measuring and tracking output will allow us to optimize and fine-tune future models and verify that they work as expected.&lt;/p&gt;
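
&lt;p&gt;A minimal sketch of such logging with SQLite; any database or logging stack works equally well, and the table schema is an assumption for illustration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect("prompt_logs.db")
db.execute("""CREATE TABLE IF NOT EXISTS prompt_runs (
    ts TEXT, prompt_name TEXT, variables TEXT, output TEXT)""")

def log_run(prompt_name, variables, output):
    # Store the input variables and the generated output for later review.
    db.execute(
        "INSERT INTO prompt_runs VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), prompt_name,
         json.dumps(variables), output),
    )
    db.commit()

log_run("sentiment-v1", {"review": "It just works."}, "positive")
&lt;/code&gt;&lt;/pre&gt;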

&lt;h2&gt;3. Optimize&lt;/h2&gt;

&lt;p&gt;In the optimization phase, we review historical prompt data to understand areas of opportunity, edge cases, exceptions, and overall performance.&lt;br&gt;
We then modify the prompt to increase accuracy as much as possible.&lt;/p&gt;

&lt;p&gt;Optimization will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;help us save time in the dataset review process&lt;/li&gt;
&lt;li&gt;increase prompt performance&lt;/li&gt;
&lt;li&gt;identify areas where a prompt needs to be broken down into smaller prompts&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;4. Create Training Dataset&lt;/h2&gt;

&lt;p&gt;To create a training dataset, we review a large number of samples.&lt;br&gt;
Some samples need to be corrected, and others require additional review, input, and feedback before being accepted into the dataset.&lt;br&gt;
Creating a training dataset is often time-consuming, but it is a required component of AI-related development work (a minimal export sketch follows the list below).&lt;/p&gt;

&lt;p&gt;The size of the dataset will depend on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;the complexity of the output&lt;/li&gt;
&lt;li&gt;the LLM selected for fine-tuning&lt;/li&gt;
&lt;li&gt;quality considerations&lt;/li&gt;
&lt;/ol&gt;
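
&lt;p&gt;As a minimal sketch, reviewed log rows can be exported to a chat-format JSONL file, the shape most fine-tuning APIs accept today; this builds on the hypothetical prompt_runs table from step 2:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json
import sqlite3

db = sqlite3.connect("prompt_logs.db")
rows = db.execute("SELECT variables, output FROM prompt_runs").fetchall()

# Each reviewed sample becomes one JSONL record; in practice only
# corrected and accepted samples should make it into this file.
with open("train.jsonl", "w") as f:
    for variables, output in rows:
        review = json.loads(variables)["review"]
        record = {"messages": [
            {"role": "user", "content": f"Classify the sentiment: {review}"},
            {"role": "assistant", "content": output},
        ]}
        f.write(json.dumps(record) + "\n")
&lt;/code&gt;&lt;/pre&gt;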

&lt;h2&gt;5. Fine-tune&lt;/h2&gt;

&lt;p&gt;The final step of the prompt development life cycle (PDLC) is to fine-tune an LLM or another type of model on the training dataset.&lt;br&gt;
With a sufficiently large training dataset, we can fine-tune or train a smaller model with similar or better performance.&lt;br&gt;
Once a model is fine-tuned, we should continue to log, track, and review data so the model can be optimized further if required.&lt;/p&gt;
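
&lt;p&gt;A minimal sketch of launching a fine-tuning job with the OpenAI API; the file name and model choice are assumptions for illustration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from openai import OpenAI  # assumes OPENAI_API_KEY is set

client = OpenAI()

# Upload the dataset produced in step 4 and start a fine-tuning job.
training_file = client.files.create(
    file=open("train.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id, model="gpt-4o-mini-2024-07-18"
)
print(job.id)  # poll this job, then keep logging the new model's outputs
&lt;/code&gt;&lt;/pre&gt;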

&lt;p&gt;You can view the original article &lt;a href="https://promptdesk.ai/articles/pdlc-prompt-development-lifecycle?ref=devto"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>promptengineering</category>
      <category>machinelearning</category>
      <category>engineering</category>
    </item>
    <item>
      <title>Top 4 open source LLM prompt management platforms.</title>
      <dc:creator>Justin Macorin</dc:creator>
      <pubDate>Thu, 14 Dec 2023 16:13:13 +0000</pubDate>
      <link>https://dev.to/justinmacorin/top-4-open-source-llm-prompt-management-platforms-4j4h</link>
      <guid>https://dev.to/justinmacorin/top-4-open-source-llm-prompt-management-platforms-4j4h</guid>
<description>&lt;h3&gt;PromptDesk&lt;/h3&gt;

&lt;p&gt;The easiest and fastest way to build prompt-based applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hwgh67pn9j4a8fan7jm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0hwgh67pn9j4a8fan7jm.png" alt="PromptDesk screenshot"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;Top 4 features:&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Collaborative GUI Prompt Builder: Featuring a user-friendly and sophisticated interface, this builder streamlines the creation of complex prompts, enabling users to craft intricate prompt structures with ease.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;100% LLM Support: PromptDesk offers seamless integration with all large language models without restrictions, limits, or waiting.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fine-Tuning and Data Management: Users have access to detailed logs and histories, facilitating the fine-tuning of datasets and prompts for optimized performance and tailored application responses.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Python SDK: Accelerates prompt-to-code, allowing effortless integration of prompts created in the GUI with Python source code (a hypothetical usage sketch follows this list).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
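
&lt;p&gt;A hypothetical sketch of what calling a GUI-built prompt from Python might look like; the class and method names are assumptions, so check the project's README for the actual interface:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Hypothetical usage; names below are assumptions, not the documented API.
from promptdesk import PromptDesk

pd = PromptDesk()  # assumes a running PromptDesk instance
story = pd.generate("short-story", {"setting": "a dark and stormy night"})
print(story)
&lt;/code&gt;&lt;/pre&gt;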

&lt;p&gt;&lt;a href="https://github.com/promptdesk/promptdesk" rel="noopener noreferrer"&gt;&lt;br&gt;
&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdabuttonfactory.com%2Fbutton.png%3Ft%3DGive%2Ba%2BStar%2Bon%2BGitHub%21%26f%3DOpen%2BSans-Bold%26ts%3D18%26tc%3Dfff%26hp%3D45%26vp%3D20%26c%3D11%26bgt%3Dunicolored%26bgc%3D0000EE" alt="Give us a Star"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;LiteLLM&lt;/h3&gt;

&lt;p&gt;Call 100+ LLMs using the OpenAI format.&lt;/p&gt;

&lt;h4&gt;Top 4 features:&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Unified API Format: It allows calling various LLM APIs using the OpenAI format, simplifying integration with multiple providers like Azure, Cohere, Anthropic, etc. (a minimal sketch follows this list).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Consistent Output and Exception Mapping: Guarantees consistent output format and maps common exceptions across different providers to OpenAI exception types.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Load Balancing and Proxy Management: Supports load balancing across multiple deployments and manages calling 100+ LLMs in OpenAI format.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Logging and Observability: Provides predefined callbacks for integration with various logging and monitoring tools.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
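
&lt;p&gt;A minimal sketch of the unified format, assuming provider API keys are set in the environment:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from litellm import completion  # pip install litellm

# One call shape for many providers: swap the model string to switch
# between OpenAI, Anthropic, Cohere, and others.
response = completion(
    model="gpt-4o-mini",  # e.g. "claude-3-haiku-20240307" for Anthropic
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
&lt;/code&gt;&lt;/pre&gt;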

&lt;p&gt;&lt;a href="https://github.com/BerriAI/litellm" rel="noopener noreferrer"&gt;&lt;br&gt;
&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdabuttonfactory.com%2Fbutton.png%3Ft%3DGive%2Ba%2BStar%2Bon%2BGitHub%21%26f%3DOpen%2BSans-Bold%26ts%3D18%26tc%3Dfff%26hp%3D45%26vp%3D20%26c%3D11%26bgt%3Dunicolored%26bgc%3D0000EE" alt="Give us a Star"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;LLMClient&lt;/h3&gt;

&lt;p&gt;A caching and debugging proxy server for LLM users.&lt;/p&gt;

&lt;h4&gt;Top 4 features:&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Multi-LLM Support: It supports various language models, including OpenAI's GPT models, Anthropic's Claude, Azure's AI models, Google's AI Text models, and more.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Function (API) Calling with Reasoning (CoT): Enables language models to reason through tasks and interact with external data via API calls. This includes built-in functions like a code interpreter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Detailed Debug Logs and Troubleshooting Support: Provides tools for debugging, including comprehensive logs and a Web UI for tracing and metrics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Long Term Memory and Vector DB Support (Built-in RAG): Supports long-term memory for maintaining context in conversations and retrieval-augmented generation (RAG) with vector database support for enhanced query responses.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://github.com/dosco/llm-client" rel="noopener noreferrer"&gt;&lt;br&gt;
&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdabuttonfactory.com%2Fbutton.png%3Ft%3DGive%2Ba%2BStar%2Bon%2BGitHub%21%26f%3DOpen%2BSans-Bold%26ts%3D18%26tc%3Dfff%26hp%3D45%26vp%3D20%26c%3D11%26bgt%3Dunicolored%26bgc%3D0000EE" alt="Give us a Star"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;GPTCache&lt;/h3&gt;

&lt;p&gt;A semantic cache for LLMs that fully integrates with LangChain and llama_index.&lt;/p&gt;

&lt;h4&gt;Top 4 features:&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Semantic Caching: Utilizes semantic analysis to cache similar queries, enhancing efficiency and reducing redundant API calls to language models (a quick-start sketch follows this list).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Modular Design: Offers flexibility in integrating various components like LLM adapters, multimodal adapters, and embedding generators for customized caching solutions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Support for Multiple LLMs and Multimodal Models: Compatible with a range of large language models and multimodal models, facilitating broad application scenarios.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Diverse Storage and Vector Store Options: Supports a variety of cache storage systems and vector stores, allowing for scalable and adaptable cache management.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
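
&lt;p&gt;A minimal sketch based on the project's quick-start, assuming an OpenAI key in the environment; the default init caches exact matches, and semantic caching requires configuring an embedding function:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from gptcache import cache
from gptcache.adapter import openai  # drop-in replacement for the openai module

cache.init()            # default exact-match cache; add an embedding fn for semantic caching
cache.set_openai_key()  # reads OPENAI_API_KEY from the environment

# Repeated (or, with semantic caching configured, similar) questions are
# answered from the cache instead of triggering a new API call.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is a semantic cache?"}],
)
&lt;/code&gt;&lt;/pre&gt;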

&lt;p&gt;&lt;a href="https://github.com/zilliztech/GPTCache" rel="noopener noreferrer"&gt;&lt;br&gt;
&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdabuttonfactory.com%2Fbutton.png%3Ft%3DGive%2Ba%2BStar%2Bon%2BGitHub%21%26f%3DOpen%2BSans-Bold%26ts%3D18%26tc%3Dfff%26hp%3D45%26vp%3D20%26c%3D11%26bgt%3Dunicolored%26bgc%3D0000EE" alt="Give us a Star"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
