Chunkify a Huge List into N Smaller Equal-Sized Lists
To backfill data for one of our machine learning pipelines, I had to divide a list of dates into n smaller lists of near-equal length and distribute them across an n-node GPU cluster.
from datetime import timedelta, date

# Build the list of candidate dates, stepping three days at a time.
start_dt = date(2023, 1, 1)
end_dt = date(2023, 12, 31)
cdays = []
while start_dt < end_dt:
    cdays.append(start_dt)
    start_dt += timedelta(days=3)
#print(cdays)
def split(a, n):
    # k = base chunk size, m = number of leading chunks that get one extra item
    k, m = divmod(len(a), n)
    return (a[i*k + min(i, m):(i+1)*k + min(i+1, m)] for i in range(n))
split_list = list(split(cdays, 15))
print(split_list[0])
#Output (the first chunk; with 122 dates split 15 ways, the first two chunks get 9 dates and the rest get 8)
[datetime.date(2023, 1, 1), datetime.date(2023, 1, 4), datetime.date(2023, 1, 7), datetime.date(2023, 1, 10), datetime.date(2023, 1, 13), datetime.date(2023, 1, 16), datetime.date(2023, 1, 19), datetime.date(2023, 1, 22), datetime.date(2023, 1, 25)]
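With the chunks in hand, each one can be handed off to its own worker. Below is a minimal sketch of that fan-out using concurrent.futures; backfill_chunk is a hypothetical stand-in for the real per-GPU backfill job, and a thread pool stands in for actual cluster dispatch.

#dispatch.py (sketch)
from concurrent.futures import ThreadPoolExecutor

def backfill_chunk(dates):
    # Hypothetical placeholder: the real pipeline would run the
    # backfill job for these dates on one GPU node.
    return f"processed {len(dates)} dates: {dates[0]} .. {dates[-1]}"

# One task per chunk; 15 matches the number of chunks created above.
with ThreadPoolExecutor(max_workers=15) as pool:
    for result in pool.map(backfill_chunk, split_list):
        print(result)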
__all__ in Python
__all__ is a list of the names (variables, functions, or modules) that a module exposes to wildcard imports (from module import *). It really comes in handy when your base module contains a large number of functions and you only want to export a few of them. For example, let's say you have:
#foo.py
waz = 3
bar = 9

def baz():
    return 'baz'

# Only bar and baz are exported to wildcard imports.
__all__ = ['bar', 'baz']
Now bar.py does a wildcard import from foo.py:
#bar.py
from foo import *

print(bar)    # 9
print(baz())  # 'baz'
print(waz)    # Raises NameError, as waz is not exported by the module.
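Note that __all__ only restricts wildcard imports; an explicit import still reaches a non-exported name:

from foo import waz  # works: __all__ does not apply to explicit imports
print(waz)  # 3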
Conclusion
I hope this was easy to follow and clear to understand.
Some Final Words
If this blog was helpful and you wish to show a little support, you could:
- 👍 300 times for this story
- Follow me on LinkedIn: https://www.linkedin.com/in/raju-n-203b2115/
These actions really really really help me out, and are much appreciated!