Evan Lin for Google Developer Experts

Posted on Jan 15 • Originally published at evanlin.com on Jan 15

Gemini 2.5 Flash: Analyzing PDF Papers Directly from Public URLs with the Gemini File API

#gemini #api #ai #programming

Background

While maintaining the linebot-arxiv project, I've always wanted to give users more in-depth paper analysis. The existing "Translate Abstract" function can only process the Abstract section of a paper, but users often need more: research methods, experimental results, chart analysis, and the rest of the paper's content.

In the past, implementing this required three steps:

Download the PDF to the local server
Upload to Google Cloud Storage (GCS)
Use the Gemini API to analyze files on GCS

This process is not only complex (requires managing storage space), but also involves additional costs (GCS storage fees + traffic fees), which is too heavy for a simple LINE Bot.

But in January 2025, Google announced a major update on its official blog: the Gemini API now supports reading files directly from Public URLs! This means we can skip GCS and let Gemini analyze arXiv PDF papers directly.

This change brings not only technical simplification, but also huge advantages in terms of cost and maintenance.

Screen Display

Old Interface

"Know More" - Displays detailed paper information
"Translate Abstract (Longer)" - Only translates the Abstract
"Save Article" - Add to Favorites

New Interface

"📋 Detailed Information" - Clearer labels
"📑 AI Analyze PDF" - New Feature! Deep analysis of the full PDF
"💾 Save Article" - Visual improvements

Analysis Result Display

Complete structured analysis, including paper overview, research methods, key findings, and application value.

Main Repo https://github.com/kkdai/linebot-arxiv

Problems Encountered During Development

Problem 1: Limitations of the Old Implementation

Before implementing the paper analysis function, the existing approach was as follows:

// ❌ Old version - can only process Abstract text
func actionGPTTranslate(event *linebot.Event, values url.Values) {
    // Named articleURL so it does not shadow the net/url package
    articleURL := values.Get("url")
    result := getArticleByURL(articleURL)

    // Can only get the text content of the Abstract
    sumResult, err := GeminiChat(
        fmt.Sprintf(PROMPT_Summarization, result[0].Summary.Body), // trailing comma is required in a multi-line call
    )

    // Return the translation result...
}

This method has obvious limitations:

❌ Shallow Content - Can only analyze the Abstract, unable to delve into the main body of the paper

User: "What experimental methods did this paper use?"
Bot: [Cannot answer, because the Abstract usually does not detail the method specifics]

❌ Unable to understand charts - The Abstract is plain text, unable to analyze the charts and data in the paper

❌ Lack of complete context - Unable to understand the experimental design, result analysis, and discussion content

Problem 2: Past Solutions Were Too Heavy

To have Gemini analyze the full PDF, the only method in the past was:

# ❌ Old method - requires GCS (illustrative pseudocode)
# 1. Download the PDF locally
pdf_data = requests.get(pdf_url).content

# 2. Upload to GCS
blob = bucket.blob(f"papers/{paper_id}.pdf")
blob.upload_from_string(pdf_data)

# 3. Use Gemini to analyze the file on GCS
file = genai.upload_file(f"gs://bucket/papers/{paper_id}.pdf")
response = model.generate_content([file, prompt])

The problems with this method:

💰 Increased Costs

GCS storage fees: $0.020 per GB/month
GCS outbound traffic: $0.12 per GB
Need to manage file lifecycle (when to delete?)

🔧 High Complexity

Need to set up a GCS bucket
Need to manage permissions (Service Account)
Need to handle upload failures, clean up old files, and other logic

⏱️ Increased Delay

Download PDF: 1-3 seconds
Upload to GCS: 1-2 seconds
Gemini analysis: 10-20 seconds
Total: 12-25 seconds

For a LINE Bot, such an architecture is really too heavy!

Problem 3: Confusion in Using the New API

When I saw the announcement on the official Google blog, I immediately wanted to integrate this feature. But when I read the documentation, I ran into some confusion:

JSON format from the official documentation:

{
  "parts": [
    {
      "fileData": {
        "mimeType": "application/pdf",
        "fileUri": "https://arxiv.org/pdf/2301.12345.pdf"
      }
    }
  ]
}

But how do you write this with the Go SDK?

First attempt (guessing from the JSON field names):

// ❌ Compilation error
parts := []genai.Part{
    genai.FileData{
        MIMEType: "application/pdf",
        FileURI: pdfURL, // Error!
    },
}

The result is an error:

./gemini.go:80:4: unknown field FileURI in struct literal of type genai.FileData

It turns out that the field name in the Go SDK is not FileURI, but URI! This is a small difference between the SDK design and the API documentation.

Correct Solution

1. Google's New Feature: File API with Public URLs

According to the Google official announcement, Gemini 2.5 Flash and newer models now support:

Feature	Old Version (GCS only)	New Version (Public URL)
File Source	Must upload to GCS	✅ Supports Public URL
File Size	Maximum 20MB	✅ Maximum 1GB
File Type	PDF, Image, Video	✅ Same support
Additional Cost	GCS storage + traffic	✅ No additional cost
Configuration Complexity	Requires GCS bucket	✅ No additional configuration required
Processing Delay	Download + Upload + Analysis	✅ Direct analysis

Supported formats:

✅ PDF (maximum 1GB)
✅ Images (PNG, JPEG, WebP)
✅ Videos (MP4, MOV, AVI)
✅ Audio (MP3, WAV, FLAC)

Restrictions:

⚠️ Must be a publicly accessible URL
⚠️ Does not support URLs that require authentication
⚠️ Does not support localhost or intranet URLs

For arXiv papers, this feature is simply perfect! All paper PDFs are public URLs, without any additional processing.

2. Cost Comparison Analysis

Let's calculate the actual cost difference:

Old Method (Using GCS)

Assuming 1000 papers are analyzed per month, and each PDF is an average of 5MB:

Storage cost:
- Total storage: 1000 papers × 5MB = 5GB
- Monthly fee: 5GB × $0.020 = $0.10

Traffic cost:
- Gemini reading: 1000 papers × 5MB = 5GB
- Monthly fee: 5GB × $0.12 = $0.60

Management cost:
- Development time for automatic cleanup scripts
- Monitoring storage space
- Handling upload failure logic

Total: $0.70 + development and maintenance costs

New Method (Using Public URL)

Additional cost: $0.00
Management cost: $0.00

Total: $0.00

Save 100% of the additional cost! And there is no need to manage GCS at all.

3. Implementing a PDF URL Conversion Tool

First, I added a URL processing function in arxiv.go:

// ConvertToPDFURL converts the arXiv abs URL to a PDF URL
func ConvertToPDFURL(absURL string) (string, error) {
    // First normalize the URL to ensure it is a valid arXiv URL
    normalizedURL, err := NormalizeArxivURL(absURL)
    if err != nil {
        return "", err
    }

    // Extract the paper ID from the normalized URL
    paperID := getIDfromURL(normalizedURL)
    if paperID == "" {
        return "", errors.New("failed to extract paper ID from URL")
    }

    // Return the PDF URL
    return "https://arxiv.org/pdf/" + paperID + ".pdf", nil
}

// IsArxivPDFURL checks if it is a valid arXiv PDF URL
func IsArxivPDFURL(link string) bool {
    parsedURL, err := url.Parse(link)
    if err != nil {
        return false
    }

    // Check if it is arxiv.org and the path contains /pdf/
    if parsedURL.Host != "arxiv.org" {
        return false
    }

    // Check if the whole path matches the /pdf/xxxx.xxxx.pdf format
    // (anchored with ^ and $ so extra path segments are rejected)
    re := regexp.MustCompile(`^/pdf/\d{4}\.\d{4,5}(v\d+)?\.pdf$`)
    return re.MatchString(parsedURL.Path)
}
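To make the conversion step concrete, here is a minimal, self-contained sketch of turning an abs URL into a PDF URL. It does not reproduce NormalizeArxivURL or getIDfromURL from the repo: absToPDFSketch and arxivIDPattern are hypothetical names, and the sketch assumes new-style arXiv IDs (such as 1706.03762) on the arxiv.org host only.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"regexp"
	"strings"
)

// arxivIDPattern matches new-style arXiv IDs such as 1706.03762 or 2301.12345v2.
// Old-style IDs (for example hep-th/9901001) are not handled in this sketch.
var arxivIDPattern = regexp.MustCompile(`^\d{4}\.\d{4,5}(v\d+)?$`)

// absToPDFSketch is a hypothetical stand-in for ConvertToPDFURL.
// It accepts only https://arxiv.org/abs/<id> style URLs.
func absToPDFSketch(absURL string) (string, error) {
	u, err := url.Parse(absURL)
	if err != nil {
		return "", err
	}
	if u.Host != "arxiv.org" || !strings.HasPrefix(u.Path, "/abs/") {
		return "", errors.New("not an arXiv abs URL")
	}
	id := strings.TrimPrefix(u.Path, "/abs/")
	if !arxivIDPattern.MatchString(id) {
		return "", errors.New("URL does not contain a valid arXiv ID")
	}
	return "https://arxiv.org/pdf/" + id + ".pdf", nil
}

func main() {
	inputs := []string{
		"https://arxiv.org/abs/1706.03762",
		"https://arxiv.org/abs/2301.12345v2",
		"https://example.com/abs/1706.03762", // rejected: wrong host
	}
	for _, in := range inputs {
		out, err := absToPDFSketch(in)
		fmt.Printf("%s -> %q (err: %v)\n", in, out, err)
	}
}
```

Rejecting anything that is not an arxiv.org abs URL keeps the conversion consistent with the arXiv-only restriction discussed next.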

Why restrict support to arXiv only?

Security considerations:

✅ arXiv URLs are all trusted academic sources
✅ Avoid being misused to analyze arbitrary PDFs
✅ Focus on the core functions of the project

Cost control:

✅ Limit the scope of use
✅ Predictable API usage

4. Implementing the Gemini PDF Analysis Function

Add a new function in gemini.go:

// GeminiPDF: Analyze PDF from Public URL and return the result
func GeminiPDF(pdfURL string, prompt string) (string, error) {
    ctx := context.Background()
    client, err := genai.NewClient(ctx, option.WithAPIKey(geminiKey))
    if err != nil {
        log.Println("Failed to create Gemini client:", err)
        return "", err
    }
    defer client.Close()

    // Use the gemini-2.5-flash model
    model := client.GenerativeModel("gemini-2.5-flash")
    value := float32(ChatTemperture)
    model.Temperature = &value

    // ✅ Key: Use the FileData structure to handle Public URLs
    parts := []genai.Part{
        genai.FileData{
            MIMEType: "application/pdf",
            URI: pdfURL, // Note: It's URI not FileURI!
        },
        genai.Text(prompt),
    }

    log.Println("Begin processing PDF from URL:", pdfURL)
    resp, err := model.GenerateContent(ctx, parts...)
    if err != nil {
        log.Println("Failed to generate content from PDF:", err)
        return "", err
    }
    log.Println("Finished processing PDF...")

    return printResponse(resp), nil
}

Implementation Highlights:

Correct field name

// ❌ Error
FileURI: pdfURL

// ✅ Correct
URI: pdfURL

MIMEType setting

MIMEType: "application/pdf" // Clearly tell Gemini this is a PDF

Temperature parameter

value := float32(ChatTemperture) // 0.2 to ensure stable analysis results
model.Temperature = &value

5. Designing Professional Analysis Prompts

In order to make the analysis results more professional and structured, I designed this Prompt:

const PROMPT_PDFAnalysis = `Please analyze this arXiv paper in Traditional Chinese, including:

📌 **Paper Overview**
- Research topic and purpose

🔬 **Research Methods**
- Used technologies and methods

💡 **Key Findings**
- Key results and contributions

🎯 **Application Value**
- Practical applications and impact

Please present it in a clear and professional manner, and respond in Traditional Chinese.`

Prompt Design Principles:

✅ Structured - Use clear sections to let the AI know the output format
✅ Use emoji - Improve readability, suitable for the LINE interface
✅ Clear language - Specify Traditional Chinese, avoid Simplified Chinese or English output
✅ Focus on key points - Focus on the 4 aspects that users care about most

6. Integrating into the Bot Processing Flow

Add new Actions and processing functions in bot.go:

// Add PostBack Action
const (
    ActionOpenDetail string = "DetailArticle"
    ActionTransArticle string = "TransArticle"
    ActionBookmarkArticle string = "BookmarkArticle"
    ActionAnalyzePDF string = "AnalyzePDF" // ✅ Added
    ActionHelp string = "Menu"
    ActonShowFav string = "MyFavs"
    ActionNewest string = "Newest"
    ActionRandom string = "Random"
)

// actionAnalyzePDF: Analyze PDF from Public URL using Gemini
func actionAnalyzePDF(event *linebot.Event, values url.Values) {
    arxivURL := values.Get("url")
    log.Println("actionAnalyzePDF: url=", arxivURL)

    // Step 1: Convert the arXiv URL to a PDF URL
    pdfURL, err := ConvertToPDFURL(arxivURL)
    if err != nil {
        log.Println("Error converting to PDF URL:", err)
        errString := fmt.Sprintf("❌ Failed to convert PDF URL: %s", err)
        bot.ReplyMessage(event.ReplyToken, linebot.NewTextMessage(errString)).Do()
        return
    }

    log.Println("Analyzing PDF:", pdfURL)

    // Step 2: Reply with a processing message first (reduce user anxiety)
    processingMsg := "🔍 Analyzing PDF paper, please wait..."
    bot.ReplyMessage(event.ReplyToken, linebot.NewTextMessage(processingMsg)).Do()

    // Step 3: Use Gemini to analyze the PDF
    analysisResult, err := GeminiPDF(pdfURL, PROMPT_PDFAnalysis)
    if err != nil {
        log.Println("Error analyzing PDF:", err)
        errString := fmt.Sprintf(`❌ PDF analysis failed: %s

This may be because:
• PDF file is too large
• API quota is insufficient
• Network connection problem

Please try again later or use the "Abstract Translation" function.`, err)
        bot.PushMessage(event.Source.UserID, linebot.NewTextMessage(errString)).Do()
        return
    }

    // Step 4: Format and send the result
    analysisResult = AddLineBreaksAroundURLs(analysisResult)
    resultMsg := fmt.Sprintf("📄 **PDF Paper Analysis Result** \n\n%s\n\n📎 Paper Link:\n%s",
        analysisResult, arxivURL)

    if _, err := bot.PushMessage(event.Source.UserID, linebot.NewTextMessage(resultMsg)).Do(); err != nil {
        log.Println("Error sending analysis result:", err)
    }
}

Flow Design Highlights:

URL Conversion: Convert from abs URL to PDF URL
Instant Feedback: Immediately reply with a "Processing" message
Error Handling: Detailed error messages + alternative solution suggestions
Result Formatting: Clear structure + original text link

7. UI Optimization - More Professional Interface

Update the button design of the Carousel Template:

// ✅ New version - professional and clear
columnList := []*linebot.CarouselColumn{}
for _, result := range records {
    var saveTogle string
    if exist, _ := InArray(result.ID, checkList); !exist {
        saveTogle = "💾 Save Article"
    } else {
        saveTogle = "🗑️ Remove Save"
    }

    detailData := fmt.Sprintf("action=%s&url=%s&user_id=%s",
        ActionOpenDetail, result.ID, userId)
    pdfData := fmt.Sprintf("action=%s&url=%s&user_id=%s",
        ActionAnalyzePDF, result.ID, userId) // ✅ Added
    SaveData := fmt.Sprintf("action=%s&url=%s&user_id=%s",
        ActionBookmarkArticle, result.ID, userId)

    tmpColumn := linebot.NewCarouselColumn(
        Image_URL,
        truncateString(result.Title, 35)+"..",
        truncateString(result.Summary.Body, 55)+"..",
        linebot.NewPostbackAction("📋 Detailed Information", detailData, "", "", "", ""),
        linebot.NewPostbackAction("📑 AI Analyze PDF", pdfData, "", "", "", ""), // ✅ New feature
        linebot.NewPostbackAction(saveTogle, SaveData, "", "", "", ""),
    )
    columnList = append(columnList, tmpColumn)
}

UI Improvement Comparison:

Item	Old Version	New Version	Improvement
Detailed Information	“Know More”	“📋 Detailed Information”	✅ Clearer
Summary Function	“Translate Abstract (Longer)”	“📑 AI Analyze PDF”	✅ More professional
Save Button	“Save Article”	“💾 Save Article”	✅ Visualization
Remove Button	“Remove Save”	“🗑️ Remove Save”	✅ Intuitive
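The postback data above is a query string that the handler later reads back as url.Values. Below is a minimal sketch of that round trip using url.Values.Encode instead of fmt.Sprintf, so that a paper URL containing characters such as "&" or "=" cannot corrupt the other fields. buildPostbackData and parsePostbackData are hypothetical helper names, not functions from the repo, and whether result.ID ever contains such characters is an assumption I have not verified.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildPostbackData encodes the action and its parameters as a query string.
// url.Values.Encode percent-escapes reserved characters in every value.
func buildPostbackData(action, paperURL, userID string) string {
	v := url.Values{}
	v.Set("action", action)
	v.Set("url", paperURL)
	v.Set("user_id", userID)
	return v.Encode()
}

// parsePostbackData reverses buildPostbackData and returns the decoded fields.
func parsePostbackData(data string) (url.Values, error) {
	return url.ParseQuery(data)
}

func main() {
	// The "U123" user ID is a placeholder for illustration only.
	data := buildPostbackData("AnalyzePDF", "https://arxiv.org/abs/1706.03762?x=1&y=2", "U123")
	fmt.Println("encoded:", data)

	values, err := parsePostbackData(data)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("action: ", values.Get("action"))
	fmt.Println("url:    ", values.Get("url"))
	fmt.Println("user_id:", values.Get("user_id"))
}
```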

Practical Application Examples

Example 1: Basic PDF Deep Analysis

User: [Search "transformer attention mechanism"]
Bot: [Displays paper list Carousel]

User: [Click "📑 AI Analyze PDF"]
Bot: 🔍 Analyzing PDF paper, please wait...

Bot: 📄 PDF Paper Analysis Result

📌 Paper Overview
This paper proposes the Transformer architecture, a new neural network model entirely based on the attention mechanism.
The main purpose is to solve the limitations of RNN and LSTM when processing long sequences...

🔬 Research Methods
The paper uses the following technologies:
1. Multi-Head Attention Mechanism
2. Positional Encoding
3. Feedforward Neural Network Layer
...

💡 Key Findings
- Achieved SOTA performance on machine translation tasks
- Training speed is 10 times faster than RNN
- Can be processed in parallel, improving efficiency
...

🎯 Application Value
The Transformer architecture later became the foundation of models such as BERT and GPT
Widely used in NLP, computer vision, speech recognition, and other fields
...

📎 Paper Link:
https://arxiv.org/abs/1706.03762

Example 2: Comparing Abstract Translation vs. PDF Analysis

Scenario: The same paper, using different functions

Using "Translate Abstract" (Old Function)

User: [Click "Translate Abstract (Longer)"]
Bot: This paper proposes a new attention mechanism model called Transformer.
The model is entirely based on the attention mechanism and does not use recurrent or convolutional structures.
It performs well on machine translation tasks and trains faster.

(About 50-100 words, only from the Abstract)

Using "AI Analyze PDF" (New Function)

User: [Click "📑 AI Analyze PDF"]
Bot: 📄 PDF Paper Analysis Result

📌 Paper Overview
- Research background: Limitations of RNN and LSTM
- Research purpose: Design a parallelizable sequence model
- Core innovation: Architecture entirely based on attention

🔬 Research Methods
1. Encoder-Decoder architecture
   - 6-layer encoder stack
   - 6-layer decoder stack

2. Multi-Head Attention Mechanism
   - Uses 8 attention heads
   - Allows the model to focus on information from different positions

3. Positional Encoding
   - Uses sin/cos functions
   - Preserves sequence order information

💡 Key Findings
1. Performance breakthrough
   - WMT 2014 English-German translation: 28.4 BLEU
   - WMT 2014 English-French translation: 41.8 BLEU
   - Surpasses all previous models

2. Training efficiency
   - Trained for 3.5 days using 8 GPUs
   - More than 10 times faster than RNN models

3. Interpretability
   - Attention weights can be visualized
   - Understand how the model makes decisions

🎯 Application Value
1. Industry impact
   - Adopted by Google Translate
   - Became the standard architecture for NLP

2. Subsequent development
   - BERT (2018)
   - GPT series (2018-2024)
   - Vision Transformer (2020)

3. Cross-domain applications
   - Natural Language Processing
   - Computer Vision
   - Speech Recognition
   - Bioinformatics

📎 Paper Link:
https://arxiv.org/abs/1706.03762

(About 300-500 words, covering the full paper content)

Comparison Summary:

Feature	Translate Abstract	AI Analyze PDF	Difference
Content Source	Abstract only	Full PDF	⬆️ 10x deeper
Word Count	50-100 words	300-500 words	⬆️ 5x more detailed
Method Details	❌ None	✅ Detailed explanation	Significantly improved
Experimental Results	❌ Brief	✅ Complete data	Quantifiable
Chart Understanding	❌ Unable	✅ Supported	Qualitative leap
Conclusion Discussion	❌ None	✅ Included	More complete

Example 3: Error Handling Demonstration

Scenario 1: PDF cannot be accessed
User: [Click "📑 AI Analyze PDF" of an invalid paper]
Bot: ❌ Failed to convert PDF URL: URL does not contain a valid arXiv ID

Scenario 2: API error
User: [Click "📑 AI Analyze PDF"]
Bot: 🔍 Analyzing PDF paper, please wait...
Bot: ❌ PDF analysis failed: API quota exceeded

This may be because:
• PDF file is too large
• API quota is insufficient
• Network connection problem

Please try again later or use the "Abstract Translation" function.

Technical Specifications

Gemini File API Structure

According to the official documentation, the File API request format is:

{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "application/pdf",
            "fileUri": "https://arxiv.org/pdf/1706.03762.pdf"
          }
        },
        {
          "text": "Please analyze this paper"
        }
      ]
    }
  ]
}

Equivalent in the Go SDK

With the Go SDK (github.com/google/generative-ai-go/genai):

// ✅ Correct usage
parts := []genai.Part{
    genai.FileData{
        MIMEType: "application/pdf",
        URI: "https://arxiv.org/pdf/1706.03762.pdf",
    },
    genai.Text("Please analyze this paper"),
}

response, err := model.GenerateContent(ctx, parts...)

Field Name Mapping:

JSON API	Go SDK	Description
`fileUri`	`URI`	⚠️ The names differ: the SDK field drops the "File" prefix
`mimeType`	`MIMEType`	Standardized naming
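One way to see why the Go field name and the JSON key can differ: in Go, struct tags decide what goes on the wire, not the field names. The sketch below builds the request body shown earlier with plain encoding/json. The types here (fileData, part, content, request) are hypothetical illustrations written for this example; they are not the SDK's own types, and I am not claiming the SDK serializes requests this way.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fileData mirrors the "fileData" JSON object. The Go field is named URI,
// while the struct tag maps it to the "fileUri" key.
type fileData struct {
	MIMEType string `json:"mimeType"`
	URI      string `json:"fileUri"`
}

// part holds either a file reference or a text prompt.
type part struct {
	FileData *fileData `json:"fileData,omitempty"`
	Text     string    `json:"text,omitempty"`
}

type content struct {
	Role  string `json:"role"`
	Parts []part `json:"parts"`
}

type request struct {
	Contents []content `json:"contents"`
}

// buildRequest returns the JSON body for one PDF plus one text prompt.
func buildRequest(pdfURL, prompt string) ([]byte, error) {
	req := request{Contents: []content{{
		Role: "user",
		Parts: []part{
			{FileData: &fileData{MIMEType: "application/pdf", URI: pdfURL}},
			{Text: prompt},
		},
	}}}
	return json.Marshal(req)
}

func main() {
	body, err := buildRequest("https://arxiv.org/pdf/1706.03762.pdf", "Please analyze this paper")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(body))
}
```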

Supported File Formats and Limitations

According to the Google official announcement:

Supported Formats:

✅ PDF: Maximum 1GB
✅ Images: PNG, JPEG, WebP, GIF
✅ Videos: MP4, MOV, AVI, MPEG
✅ Audio: MP3, WAV, FLAC, AIFF

URL Restrictions:

✅ Must be an HTTPS URL
✅ Must be publicly accessible
❌ Does not support URLs that require authentication
❌ Does not support localhost or intranet IPs
❌ May not work with URLs that redirect (in some cases)

For arXiv:

✅ All PDFs are public URLs
✅ Directly available, no additional processing required
✅ File size is usually between 5-50MB, far less than the 1GB limit
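The URL restrictions above can be checked cheaply before calling the API. The following is a rough sketch of such a pre-flight check; looksPubliclyFetchable is a hypothetical helper, not part of the repo or the SDK. It only inspects the URL string (scheme, embedded credentials, obvious local hosts and IP literals) and does not resolve DNS, so it cannot prove that a URL is actually reachable by Gemini.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// looksPubliclyFetchable reports whether rawURL passes some basic checks:
// HTTPS scheme, no embedded credentials, and no localhost or private,
// loopback, link-local, or unspecified IP literal as the host.
func looksPubliclyFetchable(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	if u.Scheme != "https" || u.User != nil {
		return false
	}
	host := u.Hostname()
	if host == "" || host == "localhost" {
		return false
	}
	// Only IP literals are checked; hostnames are not resolved here.
	if ip := net.ParseIP(host); ip != nil {
		if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() || ip.IsUnspecified() {
			return false
		}
	}
	return true
}

func main() {
	candidates := []string{
		"https://arxiv.org/pdf/1706.03762.pdf",
		"http://arxiv.org/pdf/1706.03762.pdf",
		"https://localhost/paper.pdf",
		"https://192.168.1.10/paper.pdf",
		"https://user:secret@example.com/paper.pdf",
	}
	for _, c := range candidates {
		fmt.Printf("%-45s %v\n", c, looksPubliclyFetchable(c))
	}
}
```

In linebot-arxiv this check is largely redundant, because only arxiv.org PDF URLs are accepted in the first place, but it may help if the feature is ever opened up to other sources.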

Supported Models

Model	Public URL Support	File Size Limit	Recommended Use
gemini-2.5-flash	✅	1GB	Recommended (fast and cheap)
gemini-2.5-pro	✅	1GB	Complex analysis
gemini-3.0-pro	✅	1GB	Highest quality
gemini-2.0-flash-lite	❌	-	Does not support File API

Environment Setup

Required Environment Variables

# Gemini API Key (required)
export GOOGLE_GEMINI_API_KEY="your-api-key-here"

# LINE Bot Settings
export ChannelSecret="your-channel-secret"
export ChannelAccessToken="your-channel-access-token"

# Optional: GitHub Issue as a database
export GITHUB_URL="your-github-repo-url"

Settings No Longer Needed

Because the bot now uses Public URLs, the following settings are no longer needed:

# ❌ GCS settings are no longer needed
# GOOGLE_CLOUD_PROJECT=...
# GOOGLE_APPLICATION_CREDENTIALS=...
# GCS_BUCKET_NAME=...

# ❌ Local storage settings are no longer needed
# PDF_STORAGE_PATH=...
# MAX_STORAGE_SIZE=...

This greatly simplifies the deployment process!

Local Testing

# 1. Clone the project
git clone https://github.com/kkdai/linebot-arxiv.git
cd linebot-arxiv

# 2. Set environment variables
export GOOGLE_GEMINI_API_KEY="your-api-key"
export ChannelSecret="your-channel-secret"
export ChannelAccessToken="your-channel-access-token"

# 3. Compile
go build

# 4. Execute
./linebot-arxiv

Performance and Cost Analysis

Processing Depth Comparison

Metric	Translate Abstract	AI Analyze PDF	Improvement
Content Source	Abstract	Full PDF	⬆️ 10x
Word Count	50-100	300-500	⬆️ 5x
Method Details	❌	✅ Detailed	Qualitative leap
Experimental Data	❌	✅ Complete	Quantifiable
Chart Analysis	❌	✅ Supported	New capability
Conclusion Discussion	❌	✅ Included	More complete

Speed Comparison

Stage	Translate Abstract	AI Analyze PDF	Description
API Call	1 time	1 time	Same
Processing Time	3-5 seconds	15-30 seconds	More PDF content
User Waiting	Short	Medium	Has processing prompt

Optimization Strategies:

✅ Provide "Processing" prompt to reduce anxiety
✅ Use PushMessage for asynchronous replies
✅ Allow users to choose fast or deep analysis

Cost Comparison: GCS vs. Public URL

Plan A: Using GCS (Old Method)

Assuming 1000 PDF analyses per month, averaging 5MB/paper:

GCS storage cost:
- Assuming storage for 7 days
- Average storage: 1000 × 5MB × 7/30 = 1.17GB
- Monthly fee: 1.17GB × $0.020 = $0.023

GCS outbound traffic:
- Each Gemini read: 1000 × 5MB = 5GB
- Monthly fee: 5GB × $0.12 = $0.60

Maintenance cost:
- Development of upload/delete logic: 8 hours
- Monitoring storage space: 2 hours per month
- Handling errors: 1 hour per month
- Engineer hourly rate $50: $550

Total for the first month: $0.623 + $550 = $550.62
Total for subsequent months: $0.623 + $150 = $150.62

Plan B: Using Public URL (New Method)

Additional cost: $0.00
Maintenance cost: $0.00
Development time: 2 hours (only URL conversion logic)

Total for the first month: $0.00 + $100 = $100
Total for subsequent months: $0.00

Savings in the first month: about $450. Savings in each subsequent month: $150+.

Gemini API Cost

(This cost is the same with or without GCS, because both approaches call the Gemini API.)

Using gemini-2.5-flash:

Input: $0.075 / 1M tokens
Output: $0.30 / 1M tokens

Assuming each PDF analysis:

Input tokens: ~50,000 (PDF content + prompt)
Output tokens: ~1,500 (analysis result)

Cost per analysis:
- Input: 50,000 × $0.075 / 1M = $0.00375
- Output: 1,500 × $0.30 / 1M = $0.00045
- Total: $0.0042

Monthly cost (1,000 analyses):
- 1000 × $0.0042 = $4.20

Total Cost Comparison:

Item	GCS Plan	Public URL Plan	Savings
Gemini API	$4.20	$4.20	$0
GCS Fees	$0.62	$0.00	$0.62
Development & Maintenance	$150+	$0	$150+
Total	$154.82	$4.20	$150.62 (97%)
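The arithmetic behind this table is easy to re-run with different assumptions. Here is a small sketch that recomputes the totals from the figures used in this article (token counts, per-million-token prices, GCS fees, and the monthly maintenance estimate). All constants are the article's assumptions, not measured values, and costPerAnalysis and almostEqual are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
)

// Assumptions taken from the article; adjust them to match your own usage.
const (
	analysesPerMonth = 1000
	inputTokens      = 50_000 // PDF content + prompt
	outputTokens     = 1_500  // analysis result
	inputPricePerM   = 0.075  // USD per 1M input tokens
	outputPricePerM  = 0.30   // USD per 1M output tokens
	gcsFees          = 0.62   // USD per month (storage + traffic)
	maintenance      = 150.0  // USD per month (3 engineer hours at $50)
)

// costPerAnalysis returns the Gemini API cost in USD for a single analysis.
func costPerAnalysis(inTok, outTok int, inPricePerM, outPricePerM float64) float64 {
	return float64(inTok)*inPricePerM/1e6 + float64(outTok)*outPricePerM/1e6
}

// almostEqual compares two dollar amounts with a small tolerance.
func almostEqual(a, b float64) bool {
	return math.Abs(a-b) < 1e-6
}

func main() {
	perAnalysis := costPerAnalysis(inputTokens, outputTokens, inputPricePerM, outputPricePerM)
	geminiMonthly := perAnalysis * analysesPerMonth

	gcsPlan := geminiMonthly + gcsFees + maintenance
	urlPlan := geminiMonthly
	savings := gcsPlan - urlPlan

	fmt.Printf("Gemini API per analysis: $%.4f\n", perAnalysis)
	fmt.Printf("Gemini API per month:    $%.2f\n", geminiMonthly)
	fmt.Printf("GCS plan total:          $%.2f\n", gcsPlan)
	fmt.Printf("Public URL plan total:   $%.2f\n", urlPlan)
	fmt.Printf("Savings:                 $%.2f (%.0f%%)\n", savings, savings/gcsPlan*100)
}
```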

Development Experience

1. Google's New Feature Completely Changes the Game

When I saw Google's announcement supporting Public URLs, my first reaction was: "How much trouble will this save!"

Pain points in the past:

// ❌ What needed to be done in the past (pseudocode, not compilable as-is)

// 1. Download the PDF
resp, _ := http.Get(pdfURL)
data, _ := io.ReadAll(resp.Body)

// 2. Store it locally
os.WriteFile("/tmp/paper.pdf", data, 0644)

// 3. Upload to GCS
bucket.Upload(ctx, "papers/xxx.pdf", data)

// 4. Generate the GCS URL
gsURL := "gs://bucket/papers/xxx.pdf"

// 5. Use Gemini to analyze
file := genai.upload_file(gsURL)
response := model.generate_content([file, prompt])

// 6. Remember to clean up:
// regularly delete old files, otherwise GCS fees will accumulate

Now all it takes is:

// ✅ Now a single call is enough
response, err := GeminiPDF(pdfURL, prompt)

Going from 6 steps to 1, and from managing storage to managing nothing at all, is worth far more than the technical change itself.

2. Differences between SDK Documentation and Actual Implementation

The FileURI vs URI issue taught me a lesson: don't rely on the API documentation alone; check the SDK's actual type definitions as well.

// My incorrect reasoning process:
// 1. Look at the JSON API documentation: { "fileUri": "..." }
// 2. Guess the Go struct: FileURI string
// 3. Compilation failed!

// Correct approach:
// 1. Look at the JSON API documentation
// 2. View the SDK source code or examples
// 3. Confirm the actual field name

Lesson:

✅ JSON API's fileUri → Go SDK's URI
✅ Naming conventions do not necessarily correspond exactly
✅ Look at the official examples first, then start writing

This small problem cost me only 10 minutes of debugging, but the lesson was well worth it.

3. Thinking from "Function Implementation" to "User Experience"

When I first implemented it, I only considered the function: "It's good to be able to analyze PDFs". But when I actually used it, I found:

Problem: Poor user experience

User: [Click the button]
... (Wait 20 seconds) ...
User: "Is it broken?" (Start to worry)
Bot: [Finally replies]

Improvement: Add instant feedback

User: [Click the button]
Bot: 🔍 Analyzing PDF paper, please wait...
User: "Oh, it's processing" (Relieved)
... (Wait) ...
Bot: 📄 PDF Paper Analysis Result...

This small change greatly improved the user experience. The key is not how strong the function is, but how the user feels.