DEV Community

Jesse Houwing for Xebia Microsoft Services

Posted on • Originally published at jessehouwing.net on

1

Package Feeds consuming most data in Azure DevOps Server

The Collection database of a client saw some explosive growth and I was asked to figure out where the storage had gone to.

Package Feeds consuming most data in Azure DevOps Server

You'll find a few SQL statements floating around that can help you with these types of questions, but these don't list the Package Feed data usage. After some spelunking in the my local server installation I cobbled together the following statement to dig a little bit deeper:

select 
   [f].FeedName, 
   sum(cast([list].BlockFileLength as decimal(38)))/1024.0/1024.0 AS SizeInMb
from
   BlobStore.tbl_Blob [blob]
   join BlobStore.tbl_Block [block] on [blob].BlobId = [block].BlobId
   join BlobStore.tbl_BlockList [list] on [list].BlobId = [block].BlobId
   join [Feed].[tbl_PackageVersionIndex] [fd] on '0x'+[fd].StorageId = CONVERT(varchar(max),blob.BlobId ,1) 
   join [Feed].[tbl_Feed] [f] on [fd].FeedId = [f].FeedId
   join [Feed].[tbl_PackageIndex] [p] on [p].PackageId = [fd].PackageId
group by
  [f].FeedName
order by 
  SizeInMb desc

select 
   [f].FeedName, 
   [p].PackageName, 
   sum(cast([list].BlockFileLength as decimal(38)))/1024.0/1024.0 AS SizeInMb,
   count(distinct [pvi].PackageVersionId) as [Versions]
from
   BlobStore.tbl_Blob [blob]
   join BlobStore.tbl_Block [block] on [blob].BlobId = [block].BlobId
   join BlobStore.tbl_BlockList [list] on [list].BlobId = [block].BlobId
   join [Feed].[tbl_PackageVersionIndex] [fd] on '0x'+[fd].StorageId = CONVERT(varchar(max),blob.BlobId ,1) 
   join [Feed].[tbl_Feed] [f] on [fd].FeedId = [f].FeedId
   join [Feed].[tbl_PackageIndex] [p] on [p].PackageId = [fd].PackageId
   join [Feed].[tbl_PackageVersionIndex] [pvi] on [pvi].PackageId = [p].PackageId
group by
  [f].FeedName, [p].PackageName
order by 
  SizeInMb desc

The outcome is a list of feeds and their total consumption as well as a list of feeds decomposed by the different packages in that feed.

Package Feeds consuming most data in Azure DevOps Server
Data usage per feed and per package

From here further exploration should be a piece of cake!

Photo used under creative commons.

Image of Timescale

🚀 pgai Vectorizer: SQLAlchemy and LiteLLM Make Vector Search Simple

We built pgai Vectorizer to simplify embedding management for AI applications—without needing a separate database or complex infrastructure. Since launch, developers have created over 3,000 vectorizers on Timescale Cloud, with many more self-hosted.

Read more →

Top comments (0)

Sentry image

See why 4M developers consider Sentry, “not bad.”

Fixing code doesn’t have to be the worst part of your day. Learn how Sentry can help.

Learn more