<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Subharanjan Sahoo</title>
    <description>The latest articles on DEV Community by Subharanjan Sahoo (@sahoo1987).</description>
    <link>https://dev.to/sahoo1987</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3974473%2F880b23e1-7128-4c0e-ae09-8a1b494a1ca5.png</url>
      <title>DEV Community: Subharanjan Sahoo</title>
      <link>https://dev.to/sahoo1987</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sahoo1987"/>
    <language>en</language>
    <item>
      <title>The Image Upload Problem I See Everywhere: Compressing Before Resizing</title>
      <dc:creator>Subharanjan Sahoo</dc:creator>
      <pubDate>Sun, 21 Jun 2026 11:25:28 +0000</pubDate>
      <link>https://dev.to/sahoo1987/the-image-upload-problem-i-see-everywhere-compressing-before-resizing-35eo</link>
      <guid>https://dev.to/sahoo1987/the-image-upload-problem-i-see-everywhere-compressing-before-resizing-35eo</guid>
      <description>&lt;p&gt;There is a tiny image workflow mistake I keep seeing:&lt;/p&gt;

&lt;p&gt;People try to compress a huge image before they resize it.&lt;/p&gt;

&lt;p&gt;It sounds harmless. You have an image that is too large, so you open a compression tool and push the quality slider down until the file fits. Sometimes that works. Other times the image gets blurry, text becomes soft, product details disappear, and the file is still larger than it needs to be.&lt;/p&gt;

&lt;p&gt;The issue is usually not compression itself.&lt;/p&gt;

&lt;p&gt;The issue is that the image dimensions were too large for the job in the first place.&lt;/p&gt;

&lt;h2&gt;A common example&lt;/h2&gt;

&lt;p&gt;Someone takes a photo on a phone. The image might be 3000px or 4000px wide, or even wider.&lt;/p&gt;

&lt;p&gt;Then they need to upload it somewhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a profile form&lt;/li&gt;
&lt;li&gt;a blog post&lt;/li&gt;
&lt;li&gt;a product page&lt;/li&gt;
&lt;li&gt;a support ticket&lt;/li&gt;
&lt;li&gt;a government or college portal&lt;/li&gt;
&lt;li&gt;a small business website&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most of those places do not need the original full-size photo.&lt;/p&gt;

&lt;p&gt;If the final display area is only 800px wide, uploading a 4000px-wide image is usually wasteful. The browser, CMS, or platform may resize it anyway, but the original file still has to be uploaded, processed, stored, and sometimes delivered.&lt;/p&gt;

&lt;p&gt;So if you only compress the original file, you are asking compression to solve a problem that resizing should have handled first.&lt;/p&gt;

&lt;h2&gt;My default workflow now&lt;/h2&gt;

&lt;p&gt;For everyday publishing, I usually think in this order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Decide where the image will be used.&lt;/li&gt;
&lt;li&gt;Resize it to a sensible width for that use.&lt;/li&gt;
&lt;li&gt;Choose the right format.&lt;/li&gt;
&lt;li&gt;Compress it.&lt;/li&gt;
&lt;li&gt;Open the final image and check it like a human, not just by file size.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last step matters more than people think. A file size target is not the same thing as a usable image.&lt;/p&gt;

&lt;p&gt;A 90 KB image can still be bad if faces, screenshots, signatures, text, or product details are unclear.&lt;/p&gt;

&lt;h2&gt;Resize first, then compress&lt;/h2&gt;

&lt;p&gt;Resizing removes unnecessary pixels.&lt;/p&gt;

&lt;p&gt;Compression tries to store the remaining pixels more efficiently.&lt;/p&gt;

&lt;p&gt;Those are different jobs.&lt;/p&gt;

&lt;p&gt;If you start with a 4000px-wide image and compress it aggressively, the tool still has to preserve information across a huge canvas. If you resize that image to 1200px wide first, keeping the aspect ratio, only about 9% of the original pixels remain, so there is simply less image data to store.&lt;/p&gt;

&lt;p&gt;That usually means you can use gentler compression and still end up with a smaller file.&lt;/p&gt;

&lt;p&gt;In practice, this often gives a cleaner result than forcing heavy compression on the original.&lt;/p&gt;
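
&lt;p&gt;To make the arithmetic concrete, here is a minimal Python sketch. It assumes the aspect ratio is preserved when resizing. The function names and the 4000x3000 example photo are illustrative assumptions, not taken from any particular tool.&lt;/p&gt;

```python
def resized_dimensions(width, height, target_width):
    """Return (new_width, new_height) for a resize that keeps the aspect ratio."""
    # Never upscale: cap the new width at the original width.
    new_width = min(width, target_width)
    new_height = round(height * new_width / width)
    return new_width, new_height


def pixel_fraction(width, height, target_width):
    """Fraction of the original pixel count that remains after resizing."""
    new_width, new_height = resized_dimensions(width, height, target_width)
    return (new_width * new_height) / (width * height)


# Hypothetical 4000x3000 phone photo resized to 1200px wide.
print(resized_dimensions(4000, 3000, 1200))
print(f"{pixel_fraction(4000, 3000, 1200):.0%} of the original pixels remain")
```

&lt;p&gt;Because pixel count scales with the square of the width ratio, even a modest reduction in width removes a large share of the data before any quality setting is touched.&lt;/p&gt;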

&lt;h2&gt;Format matters too&lt;/h2&gt;

&lt;p&gt;The format choice depends on the image.&lt;/p&gt;

&lt;p&gt;For normal photos, JPG is still a practical default.&lt;/p&gt;

&lt;p&gt;For graphics, screenshots, transparent images, and UI elements, PNG or WebP may make more sense.&lt;/p&gt;

&lt;p&gt;For modern web delivery, WebP can be a good option when the browsers and platforms you need to support can handle it.&lt;/p&gt;

&lt;p&gt;For phone photos, especially from iPhones, HEIC may need conversion first if the target site does not accept it.&lt;/p&gt;

&lt;p&gt;This is another reason image workflows can feel annoying. The problem is rarely just one thing. It is often dimensions, format, file size, and upload requirements all colliding at once.&lt;/p&gt;

&lt;h2&gt;What I built into PixelKit&lt;/h2&gt;

&lt;p&gt;I built PixelKit because I kept running into these small image chores while publishing things.&lt;/p&gt;

&lt;p&gt;The tools I use most in this kind of workflow are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://pixelkit.tools/tools/resize-image" rel="noopener noreferrer"&gt;Resize Image&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pixelkit.tools/tools/compress-image" rel="noopener noreferrer"&gt;Compress Image&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pixelkit.tools/heic-to-jpg" rel="noopener noreferrer"&gt;HEIC to JPG&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pixelkit.tools/tools/convert-image" rel="noopener noreferrer"&gt;Convert Image&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The point is not that every image needs a special toolchain. Most do not.&lt;/p&gt;

&lt;p&gt;The point is that the order matters:&lt;/p&gt;

&lt;p&gt;Resize first, convert if needed, then compress.&lt;/p&gt;

&lt;p&gt;That small change fixes a surprising number of upload and quality problems.&lt;/p&gt;

&lt;h2&gt;A few practical rules&lt;/h2&gt;

&lt;p&gt;For blog images, I usually avoid uploading giant originals unless there is a good reason.&lt;/p&gt;

&lt;p&gt;For product photos, I check the final image at the size buyers will actually see it. If the product texture, label, edge, or color looks wrong, the file is not ready.&lt;/p&gt;

&lt;p&gt;For screenshots, I avoid over-compressing because text gets ugly quickly.&lt;/p&gt;

&lt;p&gt;For documents, IDs, signatures, or anything with small text, I check readability before uploading.&lt;/p&gt;

&lt;p&gt;For social posts, I crop and resize to the platform's aspect ratio before compressing, not after.&lt;/p&gt;

&lt;p&gt;None of this is glamorous. It is just the difference between a workflow that feels smooth and one where you keep asking why the upload form hates your image.&lt;/p&gt;

&lt;h2&gt;The simple version&lt;/h2&gt;

&lt;p&gt;If an image is too large, do not start by crushing the quality.&lt;/p&gt;

&lt;p&gt;Start by asking:&lt;/p&gt;

&lt;p&gt;"Is this image bigger than it needs to be?"&lt;/p&gt;

&lt;p&gt;If yes, resize it first.&lt;/p&gt;

&lt;p&gt;Then compress it.&lt;/p&gt;

&lt;p&gt;That one habit usually gives you smaller files, cleaner images, and fewer weird upload failures.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>webdev</category>
      <category>performance</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Extract a Table from a PDF to Excel Without Rebuilding It Manually</title>
      <dc:creator>Subharanjan Sahoo</dc:creator>
      <pubDate>Sat, 20 Jun 2026 09:26:08 +0000</pubDate>
      <link>https://dev.to/sahoo1987/how-to-extract-a-table-from-a-pdf-to-excel-without-rebuilding-it-manually-h9k</link>
      <guid>https://dev.to/sahoo1987/how-to-extract-a-table-from-a-pdf-to-excel-without-rebuilding-it-manually-h9k</guid>
      <description>&lt;p&gt;PDFs are great when a document needs to look the same everywhere. They are less great when the thing you actually need is the table inside the document.&lt;/p&gt;

&lt;p&gt;Maybe it is a price list, a bank statement, a report, an invoice summary, a research appendix, or a batch of exported business data. The information is sitting right there, neatly arranged in rows and columns, but copying and pasting it into Excel turns it into a mess.&lt;/p&gt;

&lt;p&gt;The good news: you do not always need to retype the table manually. In many cases, you can extract a table from a PDF to Excel using a dedicated PDF table extractor, spreadsheet import features, or manual cleanup workflows.&lt;/p&gt;

&lt;p&gt;The slightly less glamorous truth: PDF table extraction is not always perfect. PDFs were designed for presentation, not structured data. A table that looks clean visually may not contain clean row-and-column data underneath. The best method depends on how the PDF was created, how complex the table is, and whether the file is text-based or scanned.&lt;/p&gt;

&lt;p&gt;This guide walks through the practical options.&lt;/p&gt;

&lt;h2&gt;First, Check What Kind of PDF You Have&lt;/h2&gt;

&lt;p&gt;Before choosing a method, it helps to understand the PDF you are working with.&lt;/p&gt;

&lt;p&gt;A text-based PDF contains selectable text. If you can highlight table text with your cursor, the file probably has extractable text data. These PDFs are usually easier to convert into Excel-friendly tables.&lt;/p&gt;

&lt;p&gt;A scanned PDF is essentially an image of a document. If you cannot select the text, the table may need OCR before it can be extracted reliably. OCR can be helpful, but it introduces another layer where mistakes can happen, especially with small text, faint scans, skewed pages, or dense financial tables.&lt;/p&gt;

&lt;p&gt;There are also hybrid PDFs, where some pages contain selectable text and others are scanned images. These can require more manual review.&lt;/p&gt;

&lt;p&gt;The cleaner the source file, the better your chances of getting a usable Excel table quickly.&lt;/p&gt;

&lt;h2&gt;Option 1: Use an Online PDF Table Extractor&lt;/h2&gt;

&lt;p&gt;For many everyday documents, the fastest route is a dedicated online table extraction tool.&lt;/p&gt;

&lt;p&gt;A PDF table extractor is built to identify tabular data in a PDF and convert it into a format you can open or clean up in Excel. This is usually much faster than copying cells by hand.&lt;/p&gt;

&lt;p&gt;For example, &lt;a href="https://pdfkit.tools/pdf-table-extractor" rel="noopener noreferrer"&gt;PDFKit's PDF Table Extractor&lt;/a&gt; is part of PDFKit, a free online PDF utility platform from Kynlab. PDFKit supports common PDF workflows such as compressing, merging, splitting, converting, rotating, deleting or reordering pages, protecting or unlocking PDFs, extracting text, extracting tables, and filling forms.&lt;/p&gt;

&lt;p&gt;PDFKit describes its core workflows as no-signup and no-watermark, with temporary files intended to be deleted within minutes. That makes it a practical option when you need a quick table extraction without installing desktop software.&lt;/p&gt;

&lt;p&gt;A typical workflow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the PDF table extractor.&lt;/li&gt;
&lt;li&gt;Upload the PDF that contains the table.&lt;/li&gt;
&lt;li&gt;Let the tool detect and extract the table data.&lt;/li&gt;
&lt;li&gt;Download the result.&lt;/li&gt;
&lt;li&gt;Open it in Excel or another spreadsheet editor.&lt;/li&gt;
&lt;li&gt;Review the output before using it in a final report or workflow.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last step matters. Even good extraction tools can misread merged cells, multi-line rows, unusual borders, footnotes, or tables that span multiple pages.&lt;/p&gt;

&lt;h2&gt;Option 2: Try Excel's Built-In PDF Import&lt;/h2&gt;

&lt;p&gt;Depending on your version of Excel, you may be able to import table data directly from a PDF.&lt;/p&gt;

&lt;p&gt;This can work well for straightforward reports where tables are clearly structured. It is especially useful if you already live in Excel and want to keep the workflow inside Microsoft's ecosystem.&lt;/p&gt;

&lt;p&gt;The general process is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open Excel.&lt;/li&gt;
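&lt;li&gt;Use the data import option for PDF files. In recent Windows versions this is on the Data tab under Get Data, then From File, then From PDF; the exact menu depends on your Excel version.&lt;/li&gt;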
&lt;li&gt;Select the PDF.&lt;/li&gt;
&lt;li&gt;Choose the detected table or page data.&lt;/li&gt;
&lt;li&gt;Load it into a worksheet.&lt;/li&gt;
&lt;li&gt;Clean up formatting, headers, and merged rows as needed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This method is convenient, but it can struggle with complex layouts. If the PDF has a multi-column page layout, repeated headers, nested tables, or visual spacing used instead of real table structure, Excel may import the data in a way that still needs cleanup.&lt;/p&gt;

&lt;p&gt;It is worth trying, especially for business reports and exported statements, but it is not always the cleanest route.&lt;/p&gt;

&lt;h2&gt;Option 3: Copy and Paste, Then Clean Up&lt;/h2&gt;

&lt;p&gt;For small tables, manual copy and paste can still be good enough.&lt;/p&gt;

&lt;p&gt;If the table is only a few rows and columns, select the text in the PDF, paste it into Excel, and see what happens. Sometimes Excel will separate columns correctly. Other times, everything lands in one column.&lt;/p&gt;

&lt;p&gt;When that happens, Excel's "Text to Columns" feature on the Data tab can help. You can split data by tabs, spaces, commas, or other delimiters, depending on how the pasted content appears.&lt;/p&gt;

&lt;p&gt;This approach is best for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Short tables&lt;/li&gt;
&lt;li&gt;One-time tasks&lt;/li&gt;
&lt;li&gt;PDFs with selectable text&lt;/li&gt;
&lt;li&gt;Simple layouts without merged cells&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is less ideal for long documents, repeated extraction work, or tables where accuracy matters. The risk of small copy errors grows quickly.&lt;/p&gt;
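
&lt;p&gt;If you prefer to script the delimiter split that Text to Columns performs, the idea can be sketched in a few lines of Python. This is only an illustration: it assumes the pasted columns are separated by tabs or by runs of two or more spaces, and the function name and sample text are hypothetical.&lt;/p&gt;

```python
import re


def text_to_columns(pasted_text, delimiter_pattern=r"\t|\s{2,}"):
    """Split pasted lines into cells on tabs or runs of two or more spaces."""
    rows = []
    for line in pasted_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from the paste
        rows.append([cell.strip() for cell in re.split(delimiter_pattern, line)])
    return rows


# Hypothetical pasted content; real PDFs vary in how they space columns.
sample = "Item    Qty    Price\nBlue Widget    2    4.50\nRed Widget\t10\t12.00"
for row in text_to_columns(sample):
    print(row)
```

&lt;p&gt;Splitting on two or more spaces keeps values such as "Blue Widget" in one cell, but it will still misplace cells when the PDF used a single space between columns, so the result needs the same review as any other method.&lt;/p&gt;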

&lt;h2&gt;Option 4: Convert the Whole PDF, Then Extract the Table&lt;/h2&gt;

&lt;p&gt;Another route is to convert the PDF into another editable format first, then pull the table from there.&lt;/p&gt;

&lt;p&gt;For example, you might convert a PDF to a document format, copy the table, and then paste it into Excel. This can help when table extraction tools struggle but the document structure is still readable.&lt;/p&gt;

&lt;p&gt;The downside is that whole-document conversion may preserve the page layout better than the table structure. You may end up with text boxes, line breaks, or visual formatting that still need cleanup.&lt;/p&gt;

&lt;p&gt;This method can be useful when the table is part of a larger document and you also need surrounding context, notes, or labels.&lt;/p&gt;

&lt;h2&gt;Why PDF Tables Break During Extraction&lt;/h2&gt;

&lt;p&gt;If you have ever extracted a PDF table and wondered why the result looked strange, the answer is usually in how PDFs store content.&lt;/p&gt;

&lt;p&gt;A PDF often does not "know" that a table is a table. It may store text as positioned fragments on a page. Lines, borders, spacing, and columns may be visual instructions rather than structured spreadsheet data.&lt;/p&gt;

&lt;p&gt;That can cause problems such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Columns merging together&lt;/li&gt;
&lt;li&gt;Rows splitting across multiple lines&lt;/li&gt;
&lt;li&gt;Headers repeating in the middle of the data&lt;/li&gt;
&lt;li&gt;Footnotes appearing as table rows&lt;/li&gt;
&lt;li&gt;Numbers losing alignment&lt;/li&gt;
&lt;li&gt;Multi-page tables breaking inconsistently&lt;/li&gt;
&lt;li&gt;Empty cells disappearing&lt;/li&gt;
&lt;li&gt;Merged cells shifting nearby values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why table extraction should be treated as a productivity shortcut, not a magic guarantee. It can save a lot of time, but important data still deserves review.&lt;/p&gt;
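
&lt;p&gt;Some of these problems are easy to clean up after extraction. As one example, here is a minimal Python sketch that removes header rows repeated in the middle of the data. It assumes the extracted table is already a list of rows, that the first row is the real header, and that repeats match it apart from case and surrounding whitespace. The function name and sample rows are hypothetical.&lt;/p&gt;

```python
def drop_repeated_headers(rows):
    """Keep the first row as the header and drop later rows identical to it."""
    if not rows:
        return []
    header = [cell.strip().lower() for cell in rows[0]]
    cleaned = [rows[0]]
    for row in rows[1:]:
        normalized = [cell.strip().lower() for cell in row]
        if normalized != header:
            cleaned.append(row)
    return cleaned


# Hypothetical extraction result from a two-page table.
rows = [
    ["Date", "Description", "Amount"],
    ["2024-01-03", "Invoice 1001", "120.00"],
    ["Date", "Description", "Amount"],  # header repeated at a page break
    ["2024-01-09", "Invoice 1002", "75.50"],
]
for row in drop_repeated_headers(rows):
    print(row)
```

&lt;p&gt;A check like this only catches exact repeats. Headers that change between pages, or footnotes that land in the table body, still need a manual look.&lt;/p&gt;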

&lt;h2&gt;Tips for Cleaner PDF-to-Excel Extraction&lt;/h2&gt;

&lt;p&gt;A few habits can improve the result.&lt;/p&gt;

&lt;p&gt;Use the original digital PDF when possible. A file exported from software is usually easier to parse than a scan or screenshot.&lt;/p&gt;

&lt;p&gt;Avoid low-quality scans. If the PDF is scanned, OCR quality depends heavily on image clarity, alignment, contrast, and text size.&lt;/p&gt;

&lt;p&gt;Extract one table at a time if the document is complex. Large reports with multiple tables, charts, sidebars, and notes can confuse extraction.&lt;/p&gt;

&lt;p&gt;Check headers and totals. These are common places for subtle errors because headers may repeat across pages and totals may be visually separated from the body.&lt;/p&gt;

&lt;p&gt;Watch for merged cells. Excel can handle merged cells, but extraction tools may interpret them differently.&lt;/p&gt;

&lt;p&gt;Validate numeric columns. If the table includes currency, percentages, dates, or IDs, check that Excel has not reformatted values unexpectedly, for example by dropping leading zeros from IDs or turning date-like text into dates.&lt;/p&gt;

&lt;p&gt;Keep the PDF open during cleanup. Comparing the spreadsheet with the original document is often the fastest way to catch misaligned rows.&lt;/p&gt;
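
&lt;p&gt;The numeric check in particular is easy to automate. The Python sketch below flags cells in an amount column that do not parse as numbers. It assumes US-style formatting (a dollar sign, commas as thousands separators, parentheses for negatives); the function names and sample rows are hypothetical.&lt;/p&gt;

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text):
    """Parse a currency-like string into a Decimal, or return None if it fails."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    # Accounting-style negatives such as (45.10) become -45.10.
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


def find_unparseable(rows, column_index):
    """Return (row_number, value) pairs whose cell is not a clean number."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        if parse_amount(row[column_index]) is None:
            problems.append((row_number, row[column_index]))
    return problems


# Hypothetical extracted rows; the last amount has a letter "l" where a "1" belongs.
rows = [["Rent", "$1,200.00"], ["Refund", "(45.10)"], ["Fee", "l2.50"]]
print(find_unparseable(rows, 1))
```

&lt;p&gt;Flagged rows are the ones worth comparing against the original PDF first. A value that parses cleanly can still be wrong, so this narrows the review without replacing it.&lt;/p&gt;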

&lt;h2&gt;When an Online Tool Is a Good Fit&lt;/h2&gt;

&lt;p&gt;An online PDF table extractor is usually a good fit when you need a quick, lightweight workflow and the PDF is not highly sensitive or unusually complex.&lt;/p&gt;

&lt;p&gt;It is especially useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pulling tables from reports&lt;/li&gt;
&lt;li&gt;Extracting invoice or statement summaries&lt;/li&gt;
&lt;li&gt;Converting simple PDF tables into spreadsheet data&lt;/li&gt;
&lt;li&gt;Avoiding manual retyping&lt;/li&gt;
&lt;li&gt;Handling occasional PDF-to-Excel tasks without installing software&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PDFKit is one option here, especially if you are already using browser-based PDF utilities. Its broader toolkit also covers related tasks like merging, splitting, compressing, converting, rotating, extracting text, and filling forms.&lt;/p&gt;

&lt;p&gt;For more complex or sensitive workflows, you may prefer a desktop tool, an internal document-processing pipeline, or a spreadsheet workflow that keeps files inside your organization's approved environment.&lt;/p&gt;

&lt;h2&gt;When You May Need a Different Approach&lt;/h2&gt;

&lt;p&gt;Not every PDF table is a good candidate for quick extraction.&lt;/p&gt;

&lt;p&gt;You may need a more specialized workflow if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The PDF is scanned and OCR quality is poor&lt;/li&gt;
&lt;li&gt;The table has many merged cells&lt;/li&gt;
&lt;li&gt;The document uses rotated text or unusual layouts&lt;/li&gt;
&lt;li&gt;Tables span many pages with changing headers&lt;/li&gt;
&lt;li&gt;The data requires audit-grade accuracy&lt;/li&gt;
&lt;li&gt;You need to process hundreds or thousands of files repeatedly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In those cases, extraction can still be possible, but you may need OCR, manual review, scripting, or a more controlled data-processing system.&lt;/p&gt;

&lt;p&gt;The right tool depends on the job. A quick online extractor is excellent for convenience. A spreadsheet import feature may be enough for simple business PDFs. A custom workflow may be better for high-volume or high-accuracy use cases.&lt;/p&gt;

&lt;h2&gt;Final Thoughts&lt;/h2&gt;

&lt;p&gt;Learning how to extract a table from a PDF to Excel is mostly about choosing the right level of effort.&lt;/p&gt;

&lt;p&gt;For simple, text-based PDFs, a tool like &lt;a href="https://pdfkit.tools/pdf-table-extractor" rel="noopener noreferrer"&gt;PDFKit's PDF Table Extractor&lt;/a&gt; can get you from PDF to spreadsheet-ready data quickly. For small tables, copy and paste may be enough. For complex reports, scanned documents, or high-stakes data, expect to spend time reviewing and cleaning the output.&lt;/p&gt;

&lt;p&gt;The goal is not just to extract the table. The goal is to get data you can trust enough to use.&lt;/p&gt;

&lt;p&gt;A good PDF-to-Excel workflow saves time, but the final check is still yours.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>pdf</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
