There are times when I need to investigate existing databases for ETL (Extract, Transform, Load) requirements. This research primarily involves just looking at the data to identify the implicit relationships that might exist between tables that aren't always captured within the database schema itself.
Two of the tools I rely upon for doing this are the Django inspectdb
command to create a set of models that I can then browse through using the Django admin app, and the Django Extensions graph_models
command to help me visualize these relationships. (Frequently I'll create my ETL process by creating a Django command, but that's perhaps a different post.)
One of the features I rely upon in the Django admin - and probably the single biggest reason why I rely upon Django for this type of work - is the ability to create links for foreign keys to jump from model to model. Having that ability to just click through from row to row across tables saves me a tremendous amount of time.
Recently, I've been working on a data-migration project involving a database containing more than 200 tables spread across 3 different PostgreSQL schemas and a total of about 500GB of data. Additionally, the account available for me to access that database is strictly read-only.
This means that, at a minimum, I'm working with three different databases - the source database mentioned above, the target database, and the Django database - with the source and target databases occupying different schemas. (Handling that could also be a whole post by itself.)
When using the Django admin, one of the general requirements is to register your model classes with the admin - and, when necessary, customizing your ModelAdmin class to make it more useful for you. This becomes tedious when you're talking about 200 models that were generated by inspectdb
. Having gone through the effort to clean up the models, I wasn't about to try to create 200 ModelAdmin classes.
So my first step was to find a way to automatically register all the models in an app. Django makes that easy enough:
admin.site.register(apps.all_models['ptcs'].values())
Every model in the 'ptcs' app is now registered.
This would work great, except the foreign keys all show up as "ModelName object (id)" in a drop-down list. Not only is that not particularly informational, but when you have 10,000 rows in a table being referenced, it really slows down rendering the page. The listing pages for the models also show this brief version which give you no clues at all as to the details contained within it.
This creates two problems then that needed to be solved -
Automatically create an
__str__
method for every model to generate some type of informational representation of the model.Change the rendering of the foreign key fields from a drop-down to a link.
Automatically generating an __str__
method for a model raises the question: "What fields to use?"
I took a sledgehammer to this issue and decided that the first 20 characters of the first four character fields in the model would be concatenated to the id field to make the representation. I had no idea what I was going to get, but I figured it would be a good start. (I figured that without any real context, other field types weren't going to provide useful information at first.) This is what I came up with.
[Disclaimer - folks, please don't do this at home, work, school, or anywhere else where some innocent person would want to gouge out their eyes after seeing this.]
from django.apps import apps
from django.contrib import admin
from django.db.models import fields
from django.urls import reverse
from django.utils.safestring import mark_safe
class MyModelAdmin(admin.ModelAdmin):
fixed_field_names = {
'ptcsuser': ['username', 'firstname', 'lastname'],
}
def get_url_factory(self, self_class, fk_field):
def get_url(instance):
ref_meta = getattr(self_class, fk_field).field.related_model._meta
fk = getattr(instance, fk_field)
return mark_safe('<a href="%s">%s</a>' % (
reverse(
'admin:{0}_{1}_change'.format(ref_meta.app_label, ref_meta.model_name),
args=(fk.pk,)
), fk)
)
get_url.short_description = fk_field
return get_url
@staticmethod
def make_str(model, field_list):
def print_fields(self):
field_values = [self.pk]
for field_name in field_list:
field_value = getattr(self, field_name, " ")
if field_value:
field_values.append(field_value[:20])
else:
field_values.append(" ")
return(" : ".join(["%s"]*len(field_values))
% tuple(field_values))
return print_fields
def __init__(self, model, admin_site):
self.exclude = [str(f).split('.')[-1]
for f in model._meta.fields
if isinstance(f, fields.related.ForeignKey)
]
self.readonly_fields = []
for fk_field in self.exclude:
fk_function = 'get_url_for_'+fk_field
setattr(self, fk_function, self.get_url_factory(model, fk_field))
self.readonly_fields.append(fk_function)
super().__init__(model, admin_site)
if model._meta.model_name in self.fixed_field_names:
self.char_fields = self.fixed_field_names[model._meta.model_name]
else:
self.char_fields = [str(f).split('.')[-1]
for f in model._meta.fields
if isinstance(f, fields.CharField)
][:4]
if self.char_fields:
model.__str__ = self.make_str(model, self.char_fields)
# Register your models here.
admin.site.register(apps.all_models['ptcs'].values(), MyModelAdmin)
Starting from the bottom (which is where this actually starts), I've changed the admin.site.register call to specify my custom MyModelAdmin class.
The __init__
method starts by building the list of fields to be excluded from the automatic field generation process.
self.exclude = [str(f).split('.')[-1]
for f in model._meta.fields
if isinstance(f, fields.related.ForeignKey)
]
This list comprehension iterates through all the fields defined in the model, identifying the fields that are ForeignKey fields. The actual field name is something like app.ModelName.field_name, and so I just want that last component -> everything after the last period.
Since I'm going to be creating the FK fields as links, I don't want the select boxes built. Next, I take that list of fields and generate a list of function calls named "get_url_for_" + the name of the FK field.
self.readonly_fields = []
for fk_field_name in self.exclude:
fk_function_name = 'get_url_for_'+fk_field_name
setattr(self, fk_function_name, self.get_url_factory(model, fk_field_name))
self.readonly_fields.append(fk_function)
This iterates through the exclude
list created above. It creates a name for the function ('get_url_for_'+fk_field_name), then calls the get_url_factory method to create the function, and assigns that function to the current MyModelAdmin instance with the created name.
(e.g. if the field name is manager
, then it would create a function named get_url_for_manager
in this ModelAdmin instance.)
The get_url_factory method creates a function to build the Django-admin-style url from the FK referenced model and instance when rendering the detail view. Since this field is just a link to the real instance, it's defined as a read-only field in the manager.
def get_url_factory(self, self_class, fk_field):
def get_url(instance):
ref_meta = getattr(self_class, fk_field).field.related_model._meta
fk = getattr(instance, fk_field)
return mark_safe('<a href="%s">%s</a>' % (
reverse(
'admin:{0}_{1}_change'.format(ref_meta.app_label, ref_meta.model_name),
args=(fk.pk,)
), fk)
)
get_url.short_description = fk_field
return get_url
Once that's done, the last thing to do is to create the __str__
method for the model.
if model._meta.model_name in self.fixed_field_names:
self.char_fields = self.fixed_field_names[model._meta.model_name]
else:
self.char_fields = [str(f).split('.')[-1]
for f in model._meta.fields
if isinstance(f, fields.CharField)
][:4]
if self.char_fields:
model.__str__ = self.make_str(model, self.char_fields)
If you noticed at the top of the first listing, I've defined a dictionary that maps a table name to a list of field names. Once I've started to gather some useful information about these tables, I can manually identify which fields I want to use in the __str__
method (the first if
statement).
If I haven't defined an entry for the table, I build a list of the CharFields in that model and take the first four.
If I have anything in that char_fields list, I then call my make_str function to create the __str__
method on the model. (There are two cases where char_fields is an empty list - either I've defined it as empty in the fixed_field_names dictionary, or there are no CharFields in the table.)
@staticmethod
def make_str(model, field_list):
def print_fields(self):
field_values = [self.pk]
for field_name in field_list:
field_value = getattr(self, field_name, " ")
if field_value:
field_values.append(field_value[:20])
else:
field_values.append(" ")
return(" : ".join(["%s"]*len(field_values))
% tuple(field_values))
return print_fields
This function takes the list of fields and creates a function that builds the list of values to be displayed - the pk and the first 20 characters of each field in field_list. It joins them all together and returns the resulting string.
Voila!
Footnote: The sharp-eyed among you may ask why I didn't just use the admin_generator
command in the Django extensions module. I tried that at first - but it generates one admin model per data model, and I realized very quickly that I would end up needing to manually tweak 100+ tables to get it to be usable for me. This ended up being a lot less work overall. (In real terms, less work doing that than writing this blog post!)
Top comments (3)
Excellent explanations.
I ran into the same problem of populating Django admin for a legacy database. I ended up using
mixin
andtype
can populate the entire admin with relevant columns.I like this a lot. Great explanations and idea.
I could see building off of it to populate
raw_id_fields
by default too.mannn this is mad..
Loved it