Lightning Developer

Host LLMs from Your Laptop Using LM Studio and Pinggy

Introduction

In the era of generative AI, developers and AI enthusiasts are continuously looking for efficient ways to deploy and share models without relying on complex cloud infrastructure. LM Studio provides an intuitive platform for running large language models (LLMs) locally, while Pinggy lets you securely expose local endpoints to the internet. This guide walks through hosting LLMs from your laptop using LM Studio and Pinggy, step by step.

Why Host LLMs Locally?

Hosting LLMs on your laptop offers several advantages:

  • Cost-Effective: No need for expensive cloud instances.
  • Data Privacy: Your data remains on your local machine.
  • Faster Prototyping: Low-latency model inference.
  • Flexible Access: Share APIs with team members and clients.

Combining LM Studio and Pinggy ensures a seamless deployment process.

Step 1: Download and Install LM Studio


Visit the LM Studio Website

  1. Go to LM Studio's official website.

  2. Download the installer for your operating system (Windows, macOS, or Linux).

Install LM Studio

  1. Follow the installation prompts.

  2. Launch the application once installation is complete.

Download Your Model

  1. Open LM Studio and navigate to the Discover tab.

  2. Browse available models and download the one you wish to use.

Run Your Model

Once the download finishes, load the model in LM Studio so it is ready to serve requests.

Step 2: Enable the Model API


Open the Developer Tab

  1. In LM Studio, click on the Developer tab.

  2. Locate the Status button in the top-left corner.

Start the API Server

  1. Change the status from Stop to Run.

  2. This launches the model's API server at http://localhost:1234.

Test the API Endpoint

Copy the displayed curl command and test it using Postman or your terminal:

curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2-0.5b-instruct",
    "messages": [
      { "role": "system", "content": "Always answer in rhymes." },
      { "role": "user", "content": "What day is it today?" }
    ],
    "temperature": 0.7,
    "max_tokens": -1,
    "stream": false
}'
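
A successful call returns an OpenAI-style chat completion object. The response below illustrates the shape only; the ID, message content, and token counts will vary with your model and prompt:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "qwen2-0.5b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Today is the day your calendar will say; go check it now, and seize the day!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 19,
    "total_tokens": 43
  }
}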

Step 3: Expose Your LM Studio API via Pinggy

Run the Pinggy Command

Pinggy tunnels over plain SSH, so there is nothing to install. Open your terminal and run the following command:

ssh -p 443 -R0:localhost:1234 a.pinggy.io

Here, -R0:localhost:1234 forwards a public address to your local server on port 1234 (the remote port 0 lets Pinggy pick one for you), and -p 443 runs SSH over port 443, which passes through most firewalls.

Enter your Token

If prompted, enter your Pinggy authentication token.

Share the Public URL

Once connected, Pinggy generates a secure public URL, such as:

https://abc123.pinggy.io

If the model responds through this URL, your local API is reachable from the internet. Share the URL with collaborators or use it for remote integration.

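To confirm the tunnel end to end, point the same request at the public URL (the hostname below is the example from above; substitute the one Pinggy gave you):

curl https://abc123.pinggy.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2-0.5b-instruct",
    "messages": [{ "role": "user", "content": "Hello from the tunnel!" }]
  }'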

Advanced Tips and Best Practices:

  1. Secure Your API:

    Add basic authentication to your tunnel:

    ssh -p 443 -R0:localhost:1234 -t a.pinggy.io b:username:password
    
    

    This ensures that only authorized users can access your public endpoint (see the request example after this list).

  2. Monitor Traffic:

    Use Pinggy's web debugger to track incoming requests and troubleshoot issues.

  3. Use Custom Domains:

    With Pinggy Pro, map your tunnel to a custom domain for branding and credibility.

  4. Optimize Performance:

    Ensure your local machine has sufficient resources to handle multiple requests efficiently.
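
With basic authentication enabled (tip 1), clients must send the same credentials on every request. A minimal check, assuming the example hostname from above and the username/password you set on the tunnel:

curl -u username:password https://abc123.pinggy.io/v1/models

Requests without valid credentials should be rejected at the tunnel, so they never reach LM Studio.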

Troubleshooting Tips:

  1. Model Fails to Start:
    Verify system requirements and compatibility, and check LM Studio logs for error messages to troubleshoot the issue.

  2. Connection Timeouts:

    On unstable networks, wrap the tunnel in an auto-reconnect loop so it restarts whenever the connection drops:

    while true; do
      ssh -p 443 -o StrictHostKeyChecking=no -R0:localhost:1234 a.pinggy.io
      sleep 10
    done

  3. Incorrect API Response:

  • Validate the curl command's syntax.
  • Ensure LM Studio is configured correctly and a model is loaded (see the sanity check after this list).
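
A quick sanity check for item 3 is to list the models the local server is exposing via LM Studio's OpenAI-compatible /v1/models endpoint:

curl http://localhost:1234/v1/models

If the model named in your request does not appear in the list, load it in LM Studio and retry the chat completion.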

Conclusion

Combining LM Studio's powerful LLM deployment with Pinggy's secure tunneling enables developers to share AI models easily, without cloud dependencies. This solution empowers rapid prototyping, remote collaboration, and seamless integration—all while maintaining full control over data and performance.
